The search engines are again under the legislative spotlight. Their powers of data retention are being eroded further. But there’s a deeper problem — take away personal data and the future of web applications is imperiled…
In their wisdom, the European Commission’s advisory body have convened on data protection, advising:
“Search engines should delete personal data held about their users within six months.”
But the thing is, six months is not nearly enough time to get value from the data the search engines keep on each person using their services — and here’s why…
In search of the Found Engine
If the likes of Google are to offer a new class of services that melds search with productivity tools, like Google Apps, then Google et al need to be able to store personal data for longer than 18 months:
“The very fact that Google are even talking about keeping our personal search data longer than eighteen months demonstrates their thinking. If you want personal search to work, then the likes of thee & me need to share more personal information with the search engines…
There’s no way around this particular search information paradox, one not solved by any purely observational routines and algorithms that Google et al might have in place.”
But what sort of applications are we talking about? Pretty simple, yet immensely effective applications that learn and adapt to our personal preferences, a concept I call the Found Engine:
“But it’s not so much a question of the search engines knowing more about you and what you want, but more about the search engines knowing what you’re doing and what you want the information for.
In the real world, you start a complex question with some background information. So if the search engines want to be as smart as people, then people have to be prepared to take the extra time and make the extra effort.”
And that’s at the heart of what I believe is going to be at the forefront of web-based office productivity applications: software that takes away the search and replaces it with the found.
Assuming you’re using these next-generation applications, you’re going to be adding events to calendars, writing up reports in word processors, mapping out sales forecasts in spreadsheets and building a showreel or a slideshow in a presentation package.
Additionally, you’re going to be sharing these documents with colleagues and, in some cases, allowing them to collaborate with you. After a while, such software will know where you look for answers on certain topics and what kind of data you add to your presentations and spreadsheets: when, where from and, eventually, why.
Before long, you’re no longer searching for things, they just get found…
Found Engine — a future case study
It’s early Monday morning. Tomorrow is the monthly sales meeting and you’re eager to get a key point across to your colleagues and the sales director. There’s a lot of research to be done, but you’re not overly concerned.
As usual, you sign into the company intranet and within a moment or two, a slew of documents appear on your profile page. The documents are a series of found matches for what the software anticipates you’re going to need for your meeting tomorrow.
“Would you like to open a spreadsheet document now?” asks the system, knowing you usually start to plan out your sales data at this time.
There’s been a steady stream of suggested readings all week, but the system is ramping up towards the date of the meeting. Some of the suggestions aren’t all that good, so you vote them down. Those that you use are automatically voted up, but you also get a chance to assign a score out of 10.
All the while, the system passively learns, making notes of your preferences, ensuring the next round of suggestions will most likely be more relevant.
This isn’t just a fancy reminder to buy flowers for your wife’s birthday; this is office productivity as it should be. Sure, you can still go out there and find stuff, but for the most part, stuff is just found, because the systems you’re using know enough about what you need to get your job done.
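For the more technically minded, the feedback loop in the case study above (vote down what misses, automatically vote up what you use, optionally score it out of 10) could be sketched roughly as follows. This is only a minimal illustration under my own assumptions: the `Document` and `FeedbackLearner` names, the topic tags and the weighting scheme are all hypothetical, and no real search engine is claimed to work this way.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Document:
    title: str
    topics: tuple  # hypothetical topic tags, e.g. ("sales", "forecast")


class FeedbackLearner:
    """Keeps a running preference weight per topic, learned from feedback."""

    def __init__(self):
        self.weights = {}

    def vote_down(self, doc):
        # An explicit down-vote lowers the weight of every topic on the document.
        self._adjust(doc, -1.0)

    def mark_used(self, doc, score=None):
        # Using a document counts as an automatic up-vote of 1.0. An optional
        # score out of 10 replaces that default with a value from 0.0 to 2.0.
        boost = 1.0 if score is None else score / 5.0
        self._adjust(doc, boost)

    def _adjust(self, doc, amount):
        for topic in doc.topics:
            self.weights[topic] = self.weights.get(topic, 0.0) + amount

    def rank(self, candidates):
        # Highest total topic weight first; unseen topics count as zero.
        def total(doc):
            return sum(self.weights.get(t, 0.0) for t in doc.topics)

        return sorted(candidates, key=total, reverse=True)


learner = FeedbackLearner()
learner.vote_down(Document("Office party photos", ("social",)))
learner.mark_used(Document("Q3 sales figures", ("sales",)), score=9)

ranked = learner.rank([
    Document("Team quiz night", ("social",)),
    Document("Regional sales forecast", ("sales", "forecast")),
])
print([doc.title for doc in ranked])
```

The point of the sketch is that the weights only become useful once they have accumulated over time, which is exactly why the length of data retention matters to this kind of service.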
Privacy versus Productivity?
In a roundabout fashion, the European Commission do make provision for long-term storage of personal search data:
“In case search engine providers retain personal data longer than six months, they must demonstrate comprehensively that it is strictly necessary for the service.”
Sadly, and rather bizarrely, the following clause all but contradicts the previous one:
“It is not necessary to collect additional personal data from individual users in order to be able to perform the service of delivering search results and advertisements.”
While the legislators seem keen on homing in on the purpose for which personal search data is stored, and on having the search engines clarify their multi-faceted agendas, the legislators themselves could do with clarifying their own rulings, given their contradictory and seemingly blunt, argumentative nature.
Now, pre-empting any claim that I’m riding roughshod over privacy, I want to make it clear that I’m not. There’s always a balance to be found, but if it’s balance they’re after, then the European Commission might want to turn their attention to governments, who tend to lose, misplace or otherwise have stolen data with greater frequency than any of the leading search engine businesses.
Case in point being the British Government’s Ministry of Defence, who’ve lost 87 USB sticks since 2006. But that’s only the tip of the iceberg; the sheer number of incidents involving lost data by the British government is considerable:
“First of all, let’s begin with four simple words: British government data loss. Small words, I grant you that, but four words with enormous repercussions.
In recent times, the British government have managed to misplace, lose, destroy and variously dispose of a great many items of personal data pertaining to possibly most if not all of the subjects of her majesty’s realm.”
Personally, I think companies like Google are too easy a target. Maybe there’s an argument for a standardized, formal method of dealing with data and “anonymizing” it for commercial use? I should say there is, but I would caution against legal intervention with regard to the business activities of Google, Microsoft and Yahoo!, since such rulings could easily set uncomfortable precedents for everyone else.
“[Article 29 Data Protection Working Party] said ‘search engine providers must delete or irreversibly anonymise personal data once they no longer serve the specified and legitimate purpose they were collected for’.”
When I read such things, I feel a tinge of embarrassment on behalf of the authors; either they’re being willfully argumentative, or they’re incredibly naive.
Going back to my Found Engine concept, such things simply couldn’t function without prior knowledge of the people using such services. Maybe an end-user license would be a way around this political impasse?
One way to look at the issue of data retention versus service quality is as if you were a business person, like I am. For me to provide a solid, personalized service to my clients, I need to know as much as I can about their specific needs.
If I were to suddenly forget everything I know in six-month chunks, just what kind of service do you think I’d be offering? Or, to look at it in pure, hard business terms, how many clients would I have left in twelve months’ time?
In fairness to the European Commission, and in part to allay concerns about legislators dictating private sector business practices, privacy is such a hot topic these days that it would behoove the search engine businesses to at least make their intentions clear. But I suspect their intentions are hardly nefarious or underhand.
Ultimately, the rights and the needs of consumers are the main focus of legislators and the search engine providers respectively. How a balance will be struck is yet to be decided. But as it stands, if search is the question, we’re all still looking for answers…