Why the Outlook search technically cannot work at all

A pragmatic explanatory approach

Repost of my article, which was first published at ITK SECURIT

Last update 1 year ago by Patrick Ruppelt

Reading time: 12 Minutes

Welcome to this video, in which I try to explain why Outlook search can never work the way users expect it to.

My name is Patrick Ruppelt. I have been working with IT systems for over 25 years, and I can tell you that the complaints about Outlook search not working just won't stop. With every new customer we win, we are told that the Outlook search is not working properly and that the previous IT manager was to blame. But most of the time it is not their fault at all. The Outlook search has a fundamental conceptual problem.

In this video I venture an explanation that is primarily intended for non-technical people. It will not solve your problems with the Outlook search, but perhaps it can help you understand why everything is the way it is. Of course, I will also show you a way to search all messages in Outlook reliably and quickly.

Further below, as usual, you will find a transcript of this video so that you can read up on individual details if you are interested.

Let's start with an example. First I start Outlook and search for a term from a recent message. Let's take the search term "apple" and narrow the search to messages from "Gymondo" that contain it.

The Outlook search returns incomplete results

If we use the Outlook search, we do find the message we are looking for. However, we find only this one message, along with the hint that we should click on "Next" again for a complete search.

No sooner said than done, but nothing much changes. We only get the hint that the search is now complete and that, if something is missing, we should refine our search. So according to Microsoft's logic, it is the user's fault if they don't find anything.

The MailStore search in the email archive is complete

But I dare say that this cannot be the only message from "Gymondo" in my mailbox with the word "apple" in it, and I don't think my search terms were to blame either.

So let's try a second way. The search via our email archive. Lo and behold, our MailStore search returns completely different results. Including this three-year-old message here, which Outlook has apparently long since forgotten.

So we see that the Outlook search results list is clearly incomplete.

But why is that?

How searching in databases works

To understand this, let's take a brief dive into the topic of databases.

Database developers may forgive me for explaining this in a very simplified way. I am not training developers here, but would like to illustrate for the layman what the fundamental difference is between a cleanly programmed database and the way Outlook confusingly stores its data.

I connect directly to one of our database servers. Here I open an accounting database and look at which accounting accounts are entered in the table "accounts".

Without knowing what everything means exactly, we recognise a clear structure of the data.

In the data record for the posting account "Technical literature" with the account number 4941, we see a so-called primary key in addition to other details for this entry. In this example it is called pk_index and has the consecutive number 27.

So, in order to search the database specifically for all bookings in the account "Technical literature", I no longer have to search the contents of all items. Rather, I can restrict the search to the current bookings. This is done here by specifying exactly where I want to search.

Also, I no longer have to search all entries for the text "technical literature". That would be very time-consuming and could take several seconds, if not minutes. Instead, I simply filter the database by the primary key I found out above. Because I am searching in a different table that merely refers to this key, it is called a foreign key there.

In the SQL language, it looks something like this:

SELECT * FROM rzacct.bookings WHERE fk_account = 27;

In the statistics, we see that this search on the server took just 0.17 milliseconds and returned a complete list of exactly the results we were looking for.

Now let's compare this with how Outlook would search in its files.
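The two-step lookup described above can be tried out with a minimal sketch. It uses Python's built-in sqlite3 module as a stand-in for the real database server. The table layout, the index name idx_bookings_account and all sample rows are assumptions made for illustration; only pk_index, fk_account, the account number 4941 and the key 27 come from the example above.

```python
import sqlite3

# In-memory stand-in for the accounting database; schema and rows are illustrative.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE accounts (
        pk_index INTEGER PRIMARY KEY,   -- primary key of the account
        number   INTEGER NOT NULL,
        name     TEXT NOT NULL
    );
    CREATE TABLE bookings (
        pk_index    INTEGER PRIMARY KEY,
        fk_account  INTEGER NOT NULL REFERENCES accounts(pk_index),
        description TEXT NOT NULL
    );
    -- The index lets the database jump straight to the matching rows.
    CREATE INDEX idx_bookings_account ON bookings(fk_account);
""")
conn.execute("INSERT INTO accounts VALUES (27, 4941, 'Technical literature')")
conn.execute("INSERT INTO accounts VALUES (28, 4930, 'Office supplies')")  # made-up account
conn.executemany(
    "INSERT INTO bookings (fk_account, description) VALUES (?, ?)",
    [(27, "SQL handbook"), (28, "Printer paper"), (27, "Exchange admin guide")],
)

# Step 1: look up the primary key of the account once.
(account_pk,) = conn.execute(
    "SELECT pk_index FROM accounts WHERE name = ?", ("Technical literature",)
).fetchone()

# Step 2: filter the bookings by foreign key instead of scanning their text.
rows = conn.execute(
    "SELECT description FROM bookings WHERE fk_account = ? ORDER BY pk_index",
    (account_pk,),
).fetchall()
booking_texts = [description for (description,) in rows]

# Ask SQLite how it would run the query; the last column describes the access path.
plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM bookings WHERE fk_account = ?", (account_pk,)
).fetchall()
plan_details = " ".join(row[-1] for row in plan)

print(booking_texts)
print(plan_details)
```

The query plan should mention the index rather than a full table scan, which is the point of the example: the database does not read every booking, it follows the key.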

The structure of Outlook files

To get to the bottom of the Outlook search, we first need to know how the email content is stored in Outlook in the first place: in one or more files on the hard disk. These files are not a real database, however, but come in various formats that only Microsoft uses.

How the search is technically implemented also depends on a variety of factors [1].

On the one hand, there is the distinction between whether the search runs locally or on the server. In the case of a local search, we speak of "WDS", which stands for "Windows Desktop Search". In the other case, experts talk about "Exchange Search".

As if all this were not complicated enough, the search procedures differ depending on which Outlook version I use. Here in the Microsoft article, only the differences between Outlook 2010 and Outlook 2013 are listed. Unfortunately, I couldn't find a more up-to-date one.

If you now believe that the WDS search procedures are at least standardised on all computers, you are unfortunately mistaken, because the Windows search also differs depending on the version of your underlying Windows system.

If WDS is not implemented uniformly, maybe the "Exchange Search" variant does better? Well, here at least the administrator has a few more ways to influence it [2], provided you have your own Exchange Server.

But let's return to Outlook once again. Besides the search function as such, the second essential question is how the data is organised. You guessed right, this is not uniform at Microsoft either.

There are the so-called "PST files", with newer Outlook versions the so-called "OST files", and if an Exchange Server is involved, we also have "EDB files", and so on [3].

If we delve a little deeper into the matter, we realise that the file format of these file types has not remained the same over the past few years, but that a lot keeps changing here. The only question is whether much of it is for the better [4].

Let's take a look at the specification of how a "PST file" is structured. In the 192-page specification we find the physical structure, which shows schematically how a "PST file", in which Outlook stores the messages, is built. Looks complicated? Yes, it probably is. There certainly is a structure, but in practice every new e-mail is simply written into this one "PST file", and so is the next one, each appended at the end [5].

This is not technically precise, but put simply: in the worst case Outlook reads through this "PST file" from beginning to end, hoping to find the search term somewhere in the file, and once it has found it, it still has to work out from this gibberish which message the hit belongs to.
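To make this worst case tangible, here is a deliberately simplified sketch in Python. It does not reproduce the real PST layout or the way Outlook or Windows Search actually work. The append-only blob, the helper names append_message and scan, and the sample messages are all assumptions for illustration.

```python
from bisect import bisect_right

# Hypothetical append-only store: every message is simply appended to one blob.
blob = bytearray()
starts = []      # byte offset at which each message begins
subjects = []    # subject per message, in the same order as starts

def append_message(subject: str, body: str) -> None:
    """Write the next message at the end of the blob and remember where it starts."""
    starts.append(len(blob))
    subjects.append(subject)
    blob.extend(f"{subject}\n{body}\n".encode("utf-8"))

def scan(term: str) -> list[str]:
    """Read the blob from start to end and map every hit back to its message."""
    needle = term.lower().encode("utf-8")
    haystack = bytes(blob).lower()
    hits = []
    pos = haystack.find(needle)
    while pos != -1:
        # Work out which message this byte offset belongs to.
        subject = subjects[bisect_right(starts, pos) - 1]
        if subject not in hits:
            hits.append(subject)
        pos = haystack.find(needle, pos + 1)
    return hits

append_message("Your training plan", "Eat an apple after every workout.")
append_message("Invoice", "Thank you for your payment.")
append_message("Recipe of the week", "Apple and oat smoothie.")

found = scan("apple")
print(found)
```

The effort of scan grows with the total size of the blob, no matter how few messages actually match. That is the opposite of the key-based lookup in a database.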

Let's take this one step further. The email from our initial example has no attachments and no particularly large content. It is 161 kilobytes (KB) in size.

If we technically extract the content of the email and put that into a Word file, we see how much cryptic information Outlook has to sift through just for this one message. 323 A4 pages for a single message.

With a typical 10 GB mailbox made up of such messages, that would come to over 20 million A4 pages that would have to be searched every time a search is made in Outlook, and all this is supposed to work in a fraction of a second. Nothing against Microsoft, but that ship has long since sailed.
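The page estimate can be checked with a short calculation. The sketch below uses only the figures from the text (161 KB, 323 pages, 10 GB) and assumes that every message in the mailbox resembles the example message, which is of course a rough simplification. The function name estimated_pages is made up for this example.

```python
# Figures taken from the text above.
message_size_kb = 161      # size of the example message
pages_per_message = 323    # A4 pages of extracted content for that message
mailbox_gb = 10            # "typical" mailbox size

def estimated_pages(kb_per_gb: int) -> int:
    """Extrapolate the page count, assuming all messages look like the example."""
    messages = mailbox_gb * kb_per_gb / message_size_kb
    return round(messages * pages_per_message)

pages_decimal = estimated_pages(1000 * 1000)   # counting 1 GB as 1,000,000 KB
pages_binary = estimated_pages(1024 * 1024)    # counting 1 GB as 1,048,576 KB

print(f"{pages_decimal:,} to {pages_binary:,} pages")
```

Depending on how a gigabyte is counted, the result lands at roughly 20 to 21 million pages, which matches the order of magnitude given above.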

Summary

Where a proper database search, as used in our MailStore email archiving, already has undeniable advantages over the conventional Outlook search, manufacturers such as MailStore refine the whole thing even further at the user level. As the description of the storage technology published by MailStore shows, additional full-text indexes are created per user. These indexes also cover attachments, for example, so that MailStore's search can even find text in attached PDF files quickly and reliably [7].

So we have significant differences not only in the way the search function is technically implemented, but above all in the way the data is stored.

Our mail archive has a real database with indexes and full-text indexing. You can search properly here. Outlook, on the other hand, uses a chaotic jumble in which presumably not even Microsoft's developers know exactly what is stored and read when, where and how. A targeted search is difficult, and one cannot expect clear and complete lists of results.
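The idea of a per-user full-text index that also covers attachments can be sketched in a few lines of Python. This is not MailStore's actual implementation, which is not described in detail here. The function names archive and search, the user names and the messages are hypothetical, and the text of an attachment (for example a PDF) is assumed to have been extracted already by some other tool.

```python
import re
from collections import defaultdict

# Hypothetical archive: one inverted index per user, mapping each word to message ids.
indexes: dict[str, dict[str, set[int]]] = defaultdict(lambda: defaultdict(set))

def tokenize(text: str) -> set[str]:
    """Split text into lower-case words."""
    return set(re.findall(r"\w+", text.lower()))

def archive(user: str, message_id: int, body: str, attachment_texts: list[str]) -> None:
    """Index the body and the already extracted text of every attachment."""
    for text in [body, *attachment_texts]:
        for word in tokenize(text):
            indexes[user][word].add(message_id)

def search(user: str, query: str) -> list[int]:
    """Return the ids of this user's messages that contain every word of the query."""
    words = tokenize(query)
    if not words or user not in indexes:
        return []
    postings = [indexes[user].get(word, set()) for word in words]
    return sorted(set.intersection(*postings))

archive("patrick", 1, "Your training plan is attached.", ["Eat an apple after every workout."])
archive("patrick", 2, "Invoice for March", [])
archive("anna", 3, "Apple pie recipe", [])

result = search("patrick", "apple workout")
print(result)
```

The work is done once, when a message is archived. A later search only looks up the query words in the index of the user in question and never has to read the messages themselves again.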

Our tip: If you want a reliable Outlook search, then simply use the search in the mail archive. Unlike the normal Outlook search, the MailStore search does exactly what you expect.

List of sources

[1] https://techcommunity.microsoft.com/t5/outlook-global-customer-service/understanding-search-scopes-in-microsoft-outlook/ba-p/428841
[2] https://social.technet.microsoft.com/wiki/contents/articles/33929.exchange-2016-content-index-and-search.aspx
[3] https://support.microsoft.com/en-us/office/introduction-to-outlook-data-files-pst-and-ost-222eaf92-a995-45d9-bde2-f331f60e2790
[4] https://docs.microsoft.com/en-us/openspecs/office_file_formats/ms-pst/141923d5-15ab-4ef1-a524-6dce75aae546
[5] https://interoperability.blob.core.windows.net/files/MS-PST/%5bMS-PST%5d.pdf
[6] https://interoperability.blob.core.windows.net/files/MS-PST/%5bMS-PST%5d.pdf
[7] https://www.mailstore.com/de/produkte/mailstore-server/technologie/integrierte-speichertechnologie/
[8] https://www.mailstore.com/de/produkte/mailstore-server/technologie/integrierte-speichertechnologie/
