Indexing Large Files in SharePoint 2010

Like its predecessor, SharePoint 2010 does not index files larger than 16 MB by default, because the search service application property ‘MaxDownloadSize’ defaults to 16 MB. We can change this by running the following PowerShell commands and then restarting the search service.

# Get the search service application (assumes the farm has a single one)
$dSize = Get-SPEnterpriseSearchServiceApplication;
$dSize.SetProperty("MaxDownloadSize", 32);
$dSize.Update();

The above commands raise the maximum size of files that SharePoint can index from 16 MB (the default) to 32 MB.

To restart the search service, run the following command on the indexing server.

Restart-Service osearch14;
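
After the restart, it can be worth confirming that the new limit was actually saved. The following is a minimal sketch, assuming the SharePoint 2010 Management Shell and a farm with a single search service application; the variable name $ssa is only illustrative.

```powershell
# Read back the current limit (in MB) from the search service application.
$ssa = Get-SPEnterpriseSearchServiceApplication
$ssa.GetProperty("MaxDownloadSize")
```

Files that were skipped earlier because of their size will probably only be picked up by a later crawl, so a full crawl of the affected content source may be needed before they appear in search results.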

Incorrect search result title for PDF documents in SharePoint 2010 search

Recently we ran into an issue similar to the one described in this post, but with PDF documents. The SharePoint metadata “Title” is empty for these documents, yet the search results show some seemingly random text (which is not even present in the document) instead of the document name. We had already implemented the fix for the Word 2007 title issue described in the above-mentioned post.

On further investigation, we found that this behavior is caused by the document’s internal “Title” property (not the document library metadata column “Title”). This title is set when the PDF file is generated by PDF creator/editor software. If the SharePoint metadata “Title” is empty for a document, the SharePoint crawler uses the document’s internal “Title” property as the search result title. You can verify this by downloading a document that shows an incorrect title in search results, opening it in Adobe Reader, and navigating to File > Properties. The “Title” property there is set to the text that is displayed as the title in search results.

So, I guess, the crawler picks the search result title in this order: the SharePoint metadata “Title”, then the document’s internal “Title” property, and finally the file name if the first two are empty. I haven’t found a fix or workaround to change this order. The best option is to encourage users to make good use of SharePoint metadata when sharing or collaborating on documents. Populating the SharePoint metadata associated with documents (especially “Title”) will improve the visibility of these documents in search.
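
To see which PDF documents are affected, one option is to list the files whose SharePoint “Title” column is empty. The following is only a rough sketch, run from the SharePoint 2010 Management Shell; the site URL and the library name are hypothetical placeholders, and it does not change how the crawler chooses the title.

```powershell
# Hypothetical site URL and library name; replace with your own.
$web  = Get-SPWeb "http://sharepoint/sites/docs"
$list = $web.Lists["Shared Documents"]

# Keep only PDF files whose "Title" column is empty and print their URLs.
$list.Items |
    Where-Object { $_.Name -like "*.pdf" -and [string]::IsNullOrEmpty($_["Title"]) } |
    ForEach-Object { $_.Url }

# Release the SPWeb object when done.
$web.Dispose()
```

Note that $list.Items loads every item in the library, so this may be slow on very large libraries.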

Are List Attachments Indexed in SharePoint Search?

Recently one of my colleagues brought to my attention that list attachments were apparently not getting indexed and hence not showing up in search results. We did some research and tests together and found that attachments are actually indexed and can be searched! I’ve seen a lot of people asking the same question on various blogs. I guess the confusion arose because of the way SharePoint shows the search result for an attachment. In the search results, SharePoint doesn’t provide a direct link to the attachment. Instead, it displays a link to the list item containing the attachment. In the description of the search result, however, you can see the words from the attachment that you searched for. This is a bit disappointing, because the search result is confusing and not very useful if the list item has multiple attachments: the user won’t know which attachment contains the words they searched for.
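
When a search result points to a list item with several attachments, listing the item's attachment URLs at least gives the candidates to check by hand. The following is a minimal sketch using the SharePoint object model from the Management Shell; the site URL, list name, and item ID are hypothetical placeholders, and it does not tell you which attachment matched the search.

```powershell
# Hypothetical site URL, list name, and item ID; replace with your own.
$web  = Get-SPWeb "http://sharepoint/sites/team"
$list = $web.Lists["Issues"]
$item = $list.GetItemById(42)

# Attachments holds the file names; UrlPrefix is the folder URL they live in.
foreach ($name in $item.Attachments) {
    $item.Attachments.UrlPrefix + $name
}

# Release the SPWeb object when done.
$web.Dispose()
```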