Search

May 08, 2008

aiimQ&A: Findability Webinar

This is my second and final post answering questions that were posed but not answered during the AIIM webinar on Findability. (See the earlier post for additional Q&A.)

(Findability: the quality of being locatable or navigable, includes technologies and concepts such as Search, Taxonomies, Information Architecture, Auto-Classification, Agents, Discovery, Ontologies, and the Semantic Web.)

The webinar is available for download. I have also posted my slides to Slideshare, available below.

Q: Does Google's product replace the need for a document management product?

A:  Let me answer this question not just as it relates to the Google product,  but any and all findability tools – NO.  Remember these tools provide access to content – any content that is posted online.  These products typically do not make any guarantee regarding the validity or quality of that content.  That is the domain of complementary technologies such as content security, document management and web content management.   (Some solution providers do bundle these technologies together.)

Q:  When searches show lots of dirty results, how should you handle or clean up metadata?

A:  This is a function of the underlying database in the document management or tagging system being used.  The functionality you are seeking is founded in traditional database processing, namely  field updates in batch mode.
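As a rough illustration of what a batch field update can look like, the sketch below normalizes inconsistent values in one metadata field using Python's built-in sqlite3 module. The `documents` table, its `department` column and the mapping of variants are all hypothetical; a real document management or tagging system would expose its own schema or administrative tools for this.

```python
import sqlite3

# Hypothetical mapping from messy metadata values to one canonical term.
CANONICAL = {
    "hr": "Human Resources",
    "human resources": "Human Resources",
    "h.r.": "Human Resources",
    "fin": "Finance",
    "finance": "Finance",
}


def normalize_department(conn):
    """Batch-update the (assumed) department field to canonical values.

    Returns the number of rows that were changed.
    """
    rows = conn.execute("SELECT id, department FROM documents").fetchall()
    updates = []
    for doc_id, value in rows:
        key = (value or "").strip().lower()
        canonical = CANONICAL.get(key)
        # Only touch rows whose value is a known variant and not already clean.
        if canonical is not None and canonical != value:
            updates.append((canonical, doc_id))
    with conn:  # commit all updates together as one transaction
        conn.executemany(
            "UPDATE documents SET department = ? WHERE id = ?", updates
        )
    return len(updates)


def demo():
    """Build a small in-memory table with dirty metadata and clean it."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE documents (id INTEGER PRIMARY KEY, title TEXT, department TEXT)"
    )
    conn.executemany(
        "INSERT INTO documents (title, department) VALUES (?, ?)",
        [
            ("Leave policy", "HR"),
            ("Budget 2008", " fin "),
            ("Org chart", "Human Resources"),
            ("Misc", "Unknown"),
        ],
    )
    changed = normalize_department(conn)
    values = [r[0] for r in conn.execute("SELECT department FROM documents ORDER BY id")]
    return changed, values
```

Values that are not in the mapping (such as "Unknown" above) are left alone, so they can be reviewed by someone who knows the content rather than guessed at by the script.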

Q:  How does universal search relate to OpenSearch API implemented by some Open Source products?

A:  Universal search, similar to federated search, is the ability for a single search tool to search across multiple repositories and provide a single ranked listing of retrieved content.  It eliminates the need to issue individual queries in each repository.  The OpenSearch standard is a collection of simple formats for the sharing of search results. In other words, it is an approach to federating searches. My understanding of Universal Search (a term brought up by Google in the webinar, and one they use often to describe and differentiate their approach to searching across multiple repositories) is that it does not collect or share search results (thus it would not be using the OpenSearch API), but rather directly accesses and indexes files in multiple repositories.  Universal search uses a connector framework. The connectors are available under open source licenses but, as far as I know, these connectors are not OpenSearch APIs.
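To make the distinction concrete, here is a toy sketch contrasting the two approaches. It is not Google's implementation and does not use the OpenSearch formats; the repositories, document ids and word-count scoring are invented purely for illustration. The federated function asks each repository for results and merges them, while the universal function answers from a single index that a connector-style crawl built in advance.

```python
from collections import defaultdict

# Two hypothetical repositories: document id -> text.
REPOSITORIES = {
    "file_share": {"fs-1": "records retention policy", "fs-2": "travel policy"},
    "wiki": {"wk-1": "retention schedule for records", "wk-2": "team lunch menu"},
}


def score(query, text):
    """Toy relevance: number of query words that appear in the text."""
    words = set(text.lower().split())
    return sum(1 for w in query.lower().split() if w in words)


def federated_search(query):
    """Query each repository separately, then merge the result lists."""
    merged = []
    for docs in REPOSITORIES.values():
        # This inner loop stands in for a per-repository search call.
        for doc_id, text in docs.items():
            s = score(query, text)
            if s > 0:
                merged.append((s, doc_id))
    merged.sort(key=lambda pair: (-pair[0], pair[1]))
    return [doc_id for _, doc_id in merged]


def build_unified_index():
    """Connector-style crawl: read every repository into one inverted index."""
    index = defaultdict(set)
    for docs in REPOSITORIES.values():
        for doc_id, text in docs.items():
            for word in text.lower().split():
                index[word].add(doc_id)
    return index


def universal_search(query, index):
    """Answer the query from the single index, without contacting repositories."""
    hits = defaultdict(int)
    for word in query.lower().split():
        for doc_id in index.get(word, ()):
            hits[doc_id] += 1
    return sorted(hits, key=lambda d: (-hits[d], d))
```

In this toy both paths return the same ranked list. The practical difference is where the work happens: federation depends on every repository answering at query time, whereas the unified index has to be kept current as content changes.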

Finally - the following comment was not posed during the webinar, but was sent to me shortly thereafter.  It is relevant and interesting enough that I chose to share it here.

Q:  Carl, your comment on the Finding Content webinar this afternoon about “aligning content to business strategy” could have been taken as a reference to Strategy Markup Language (StratML).

I hope Google is factoring the potential of StratML into their own strategy.  Whether each piece of “content” (which I prefer to call a “record”) has been associated with an organization’s strategic objectives or not might be considered to be a pretty important factor as to its relevancy ranking in enterprise search services.  It might also be a pretty good indicator to report to stakeholders with regard to how well an organization is managing itself.

A:  My point was that by designing specific approaches to retrieving and displaying content that are aligned to business goals and objectives, an organization can actually steer behavior or response from users of content. (A simple example of a commercial application is the "people who bought this product also buy these other products" type messages and links that emanate from searches in some online stores. These prompts are add-on features of the search engine that help to drive additional sales.)  The StratML standard is a very literal approach to the point I was making when I said findability could be used to align content to business strategy, i.e. StratML is an XML schema for strategic plans, which includes an approach to directly linking content to any or all facets of a strategic plan.  It's a bit of a different spin than what I had in mind, but I do like the comment made that if an organization were using this standard to tag content, the tags could be used as input to relevancy ranking algorithms.  Clever - food for thought.
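One way to picture the commenter's idea is a re-ranking step that nudges results upward when they are tagged with objectives the organization currently cares about. The sketch below is only an assumption about how that might work: the objective labels are made-up strings rather than actual StratML element names, and the additive boost is an arbitrary choice, not a formula from any search product.

```python
def rank(results, priority_objectives, boost=0.5):
    """Re-rank (doc_id, base_score, objective_tags) tuples.

    Each tag that matches a priority objective adds `boost` to the
    base relevance score supplied by the search engine.
    """
    priorities = set(priority_objectives)
    rescored = []
    for doc_id, base_score, tags in results:
        matches = len(set(tags) & priorities)
        rescored.append((base_score + boost * matches, doc_id))
    # Highest score first; ties broken by document id for stable output.
    rescored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [doc_id for _, doc_id in rescored]


# Hypothetical search results with hypothetical objective tags.
RESULTS = [
    ("memo", 1.0, []),
    ("plan", 0.8, ["grow-revenue"]),
    ("faq", 0.9, ["reduce-cost"]),
]
```

With no priorities the original order by base score is kept; naming "grow-revenue" as a priority lifts the tagged plan above the untagged memo.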

aiimQ&A: Findability Webinar

Yesterday I had the privilege of presenting a webinar on the topic of Findability as part of the AIIM Wednesday webinar series.  (Findability: the quality of being locatable or navigable, includes technologies and concepts such as Search, Taxonomies, Information Architecture, Auto-Classification, Agents, Discovery, Ontologies, and the Semantic Web.)

I say privilege for two reasons.  First, Findability is one of my favorite topics (second only to knowledge management).  Second, the webinar gave me a chance to share some of the latest results from the survey we are running to support the upcoming AIIM Market IQ on Findability.  (BTW - if you haven't already, you can take the survey.)

The webinar is available for download. I have also posted my slides to Slideshare, available below.

The webinar was sponsored by Google, Baynote, Attivio and Systemware. As usual, the AIIM webinar attracted a lively crowd.  Approximately 250 people attended, and also as usual we could not answer all of the questions that were posed during the presentation.  Following tradition, I am addressing the outstanding questions here in my blog.

Q:  I beg to differ that "in the consumer world, findability is not an issue". There are many daily searches I do, outside of Google (and some in) that are incredibly frustrating. I have no question here, but thought you should know...

A: (First let me explain for those who did not attend: this comment was made in reaction to a statement I made in my presentation to the effect that Findability is less of an issue on the commercial web, and that most sites on the commercial web have looked at this issue strategically and built findability into the site.) Sorry, my statement may have been misleading.  I was not proposing that all web-based experiences are "effective".  I was drawing attention to some web sites, mostly those that are commercial in nature (e.g. amazon.com, the iTunes store, eBay), that rely heavily on user interfaces to drive their business.  I made this statement to reinforce the survey finding that, currently, web-based experiences with findability are driving increased demand for better findability within the enterprise/intranet. Many (not all) commercial website owners have taken time to fine-tune a findability strategy because, in many cases, findability is absolutely vital to their existence.  (Imagine not being able to find the music you want in iTunes – you would abandon the site very quickly, resulting in lost revenue to Apple.)

That said, yes, you are right, there are still many commercial web sites that do not provide a good findability experience.  (See the earlier post in which I discuss this further.)  And clearly there are many internet sites (i.e. non-commercial/consumer-oriented) whose findability rivals some of the worst intranet/enterprise sites.  Findability is tricky and demands careful and targeted development of strategy and solution design.  It's an effort that some organizations have not yet appreciated the value of. They will, in my opinion, especially as we can point to the effectiveness of the state-of-the-art found in some commercial web sites.

Q: Are there any other findability solutions apart from Google?

A:  Oh lord yes.  Google was the only sponsor that paid for a speaking role in the webinar, but there are many other  solution providers that address findability - some can complement a tool such as Google, some directly compete with a tool such as Google.   In fact, three other solution providers, Baynote, Attivio and Systemware also sponsored this webinar.  But the list of solution providers goes far beyond even these companies.  Rather than list them all here, I direct you to an exhaustive list posted on Dan Keldsen's blog, a list we used to vet the list of technology providers we query about in our survey.

Q:  Who typically owns the taxonomy in an organization?

A:  In many organizations, no one, which is part of the challenge in maintaining an information architecture or findability strategy.  According to a study I did nearly 3 years ago, when the taxonomy is an online resource, as part of a findability strategy, the taxonomy is usually (57% of those surveyed) owned by IT.  (The other 43% of responses were scattered across records managers, corporate librarians, LOB managers, end users and management.)  No matter who owns it, the recommended practice is to have a mix of disciplines involved in the process of developing it. While one might argue that IT should "own" the online taxonomy as a tool, in most cases IT should not be tasked with the definition and development of a taxonomy on business content, as they are likely not SMEs of the business content or its usage by the business community, which is the focus or value statement of a taxonomy within a findability strategy.

Stay tuned, there are three more questions that I will cover in the next post.

April 30, 2008

What Does Bill Gates Know that You Don't?

Microsoft recently purchased FAST for $1.2 Billion (USD), and is now attempting an acquisition of Yahoo. Google is now the number one most recognized brand in the world, for the second year in a row. The search market is HOT – and for good reason.

Content is useless if no one can FIND it. 

OK, to readers of my blog, this is not new (see earlier posts).  This is a shameless request for a favor. 

AIIM Market Intelligence is embarking on a market study on the state-of-the-art in Findability (the quality of being locatable or navigable, includes technologies and concepts such as Search, Taxonomies, Information Architecture, Auto-Classification, Agents, Discovery, Ontologies, and the Semantic Web), and I would very much appreciate it if you can take a survey on the topic.  (You can find the survey at http://aiimMarketIntelligence.questionpro.com/.)

It should only take you about 20 minutes to take.

If you participate you will receive:
* An early FREE copy of the findings
* An invitation to a FREE live web briefing on the results
* A chance to win one of 25 gift certificates worth $25 USD for Amazon.com.

In either case, stay tuned, the Market IQ on Findability will be available the last week of June 2008, as will a webinar on the findings. Of course, I am sure to be blogging on the results before then as well.

Thank you.

April 06, 2008

Thought Leaders Meet at ECM Writers Summit

What a great week for ECM, Enterprise 2.0, SaaS and me.

Looking back on the past week, I have to say the ECM techno-geek side of me is smiling.  As I posted earlier, the week started off with participating in an AIIM New England chapter event that included a panel of users that have adopted a SaaS model for ECM.  As if that wasn’t enough fun and education, the week ended with my moderating and participating in a thought leadership writer’s summit on SaaS & cloud computing, SOA & BPM, Social Computing & ECM, and Text Analytics.

The event was sponsored by EMC.  I again thank them for inviting me to co-host this summit. Some of the brightest strategists and technologists from EMC were there including Howard Shao, Mark Lewis, Whitney Tidmarsh, Razmik Abnous, Michael Hackney, Matt Coblentz, and Lubor Ptacek.  More importantly, we were joined by a variety of ECM industry thought leaders including Nathaniel Palmer, Barclay Blair, David DeLong, Margie Semilof, Mary Cohodas, Geoff Bock, Bill Trippe, Vincent Berdot, Stephen Cameron, Christian Daems, Christos Varelas, Ron Miller and Beth Pariseau (see her post on this event).  (I apologize for the inevitable omission of others who were there, whose names I fail to recall at the moment.)

Well, as you can imagine, with such a crowd, the discussion was lively and full of opinion (sometimes agreeing and sometimes differing). The purpose of this post is to provide my recollection of the key points that came from the discussion.

Despite the variety of topics (SaaS & cloud computing, SOA & BPM, Social Computing & ECM, and Text Analytics), discussion almost always came back to a basic value proposition for ECM, striking a balance between increased access/collaboration, and content governance and security. (See the AIIM Market IQ on Content Security for more on this idea, and a post by summit participant Ron Miller.) Terms frequently uttered in discussion included mobility, social networks, collaboration, agility, flexibility, e-discovery, compliance and risk. Collectively these seem to represent the potential benefits associated with ECM.

ECM was frequently discussed not as a technology, or a single implementation, but as a platform, a competency that should be available across the entire enterprise. In this regard, the group often reiterated that solution providers and pundits of ECM all too often talk in terms of unstructured content, and that this is wrong.  ECM is about all forms of content – and therefore should provide a single integrated interface to the unstructured content (e.g. documents), as well as structured content (e.g. databases associated with ERP and payroll systems.)  Too much focus has been paid to the unstructured content separately and distinctly from the structured.

This single interface was extended to the concept of enterprise search.  We discussed that enterprise search has erroneously been discussed in the market far too often as a product.  The often-touted single enterprise master taxonomy and search tool is not the most effective approach.  In reality, effective search across the enterprise will likely involve multiple search tools, taxonomies, relevancy rankings, etc., each finely tailored and tuned to specific content and use cases, but presented and managed as a single interface to the user.  The group agreed that this requires great complexity on the part of IT, but that complexity can and needs to be hidden from the user.

We all acknowledged that the rules of publishing have changed.  On the positive side this has allowed faster and more wide scale dissemination of knowledge and experience.  On the other, this has created a demand for new approaches to demonstrate reliability and trust in “discovered” content.

Similarly, the long tail of electronic content (compared to the much shorter tail of paper content) necessitates more powerful approaches to management, retention, and findability.  Without them, enterprise content can quickly become chaotic and/or grossly underutilized.

In this regard, Matt Coblentz of EMC proposed that “Content is Stupid”.  The group agreed, (or at least some did), with the addition that Content Management is intelligent.

We thought that overall culture was ahead of technology with regards to collaboration, but behind technology with regards to security and compliance.   

Some of us saw ECM in a state of evolution, progressively increasing functionality and ease of use over time.  Others argued that the advent of functionality such as SaaS and Enterprise 2.0 represents a hockey stick inflection point for the industry, that will be viewed as a revolutionary point in the market in time to come.

I for one walked away with a sense that ECM is once again a very exciting marketplace.  Ron Miller reminded us, that in his review of the AIIM 2008 show, he had indicated that the show was buzzing with excitement.

With that, the realm of ECM has become increasingly complex.  ECM is not just about technology, nor just about content.  The ECM practitioner MUST be concerned and involved in people, process and content (EMC’s words), or content, community and context (my words.) This is what keeps this market place alive and vibrant.  This is what affords careers and debates that go far beyond technology alone.

On a final note, I will share a light moment. To a large degree there was much reaffirming among this  group, as opposed to learning. There were some exceptions. Two new technologies were introduced:  “blockies and wigs.”  These terms were coined by one of the speakers in a slip of the tongue in his excitement over the power of “wikis and blogs”.  We all got a good laugh out of it. OK – maybe you had to be there, or maybe you just aren’t ECM-geeky enough.

April 01, 2008

aiimALERT: HP climbs Tower to ECM; Stops Short of Top Floor

Yesterday HP and Tower Software announced a pre-bid agreement for HP to acquire Tower. (See press release.)

The independent/standalone ECM solution market has one fewer vendor with this acquisition, leaving this segment of the market extremely lean.  Are Interwoven and Open Text the only two top-tier players left alone (belles of the ball or wallflowers)?

OK, that aside, what does this mean specifically for HP and Tower?  According to the press release, "The acquisition of Tower will add electronic records management to HP Software’s existing e-discovery and compliance capabilities in information collection and retention."  No argument there, nothing very new there either.  Under a pre-existing agreement, Tower TRIM Context was already integrated with the HP Integrated Archive Platform, providing records management within the compliance archiving platform.

Although Tower has consistently touted its records management capabilities, it did so within a full ECM suite, that includes functionality such as workflow, document management, e-mail management, document assembly, web content management and, to a certain degree, collaboration (i.e. a small step into the Enterprise 2.0 arena). 

There is more potential to this acquisition than HP is touting in the release. Are they being near-sighted based on the previous partner relationship?  It would behoove HP to re-assess the value of this acquisition and perhaps position these new capabilities beyond records management, e-discovery and compliance, into the realm of a full fledged ECM platform offering from a systems and hardware provider, putting them head-to-head with the likes of IBM and EMC.

March 18, 2008

aiimALERT: Teragram Acquired in SASsy Move

SAS announced the acquisition of  Teragram, a  natural language processing (NLP), taxonomy generation and linguistic analysis technology provider. The acquisition will enhance SAS’ text mining and analytical BI offerings, and extend them to enterprise and mobile search. Terms of the acquisition deal were not disclosed. (See announcement.)

Typically business intelligence is associated with structured data, data warehousing and statistical models.  Those in the ECM industry who have followed the search and retrieval market know, however, that search and taxonomy software from vendors such as Teragram, as well as Autonomy, InXight, Vivisimo, Stratify, Clear Forest, Factiva, IBM, Xerox, Mondeca, FAST and Mondosoft, has been capable of similar intelligence and analysis of unstructured content for several years.

Despite that, all too often, enterprise search initiatives focus exclusively on search and retrieval.  While search and retrieval is a most important and fundamental component of any ECM strategy, enterprises can extract a much higher level of value and insight if they appreciate and leverage these search and text management tools beyond search.

This acquisition by SAS marks a new level of market recognition of the power of natural language processing.  SAS is a notable player in the BI space.  Their recognition that the level of insights possible from properly managing structured content must also be extended to unstructured content is a wake-up call to everyone.  Unstructured content grows exponentially on a daily basis, including resources such as web pages, word processing files, presentations and e-mail, to name a few. Text analytics, mining and management technology that leverages NLP can mine the intelligence that is contained in these resources individually and, more powerfully, collectively.  The latter ability is a form of emergent technology, a function that is fundamental to Enterprise 2.0, the focus of our upcoming Market IQ (register for webinar now). Indeed, search is identified as a key enabling technology for Enterprise 2.0 in the Market IQ, and in our Enterprise 2.0 training program.

For this reason, and others, the subject of text management, search and analytics will be covered in great detail in the AIIM Q2 Market IQ on Findability.  This report will expound on the fact that there is much more to search and taxonomy than locating content, typically the focus of enterprise search.  Search and retrieval are just the tip of the iceberg.  Findability leverages the full power of NLP. The technologies that enable intelligent responses to user generated queries can also be the basis of emergent wisdom from content collections, sentiment detection, trends analysis and predictive analysis.  These applications quickly extend NLP from the enterprise search market squarely into BI, risk management and knowledge management.

January 08, 2008

aiimALERT: Microsoft Takes FAST Track to Search Market

Microsoft Corp has offered to buy search software company Fast Search & Transfer for about $1.2 billion, the companies said. Microsoft and FAST will hold a telephone conference to discuss the offer today at 1815 GMT. (See details.)

This event is sure to shake up the enterprise search market.  But will it be the final earthquake?  The search market has always seen platform players such as Microsoft and Oracle make forays into search.  The appeal of these offerings has always been lower cost and integration into applications, leaving a market for higher end dedicated search functionality providers.  Dedicated search providers, such as FAST as well as Autonomy, Endeca and Vivisimo, to name a few, remained viable and healthy solution providers by offering state-of-the-industry search functionality.  Organizations that required substantial, independent search functionality continued to represent a fertile market for these high-end search tools. 

The likely acquisition of one of these leading search vendors (FAST) by a major platform player (Microsoft) changes the rules of that game.  The high-end functionality and expertise of FAST and the market reach of Microsoft are greatly complementary.  “This acquisition gives FAST an exciting way to spread our cutting-edge search technologies and innovations to more and more organizations across the world,” said John Lervik, CEO of FAST. “By joining Microsoft, we can benefit from the momentum behind the SharePoint business productivity platform to really empower a broader set of users through Microsoft’s strong sales and marketing network.”  Lervik certainly got that right.  (It is a shame he was not as insightful earlier this year, when he commented "We are disappointed with [our] numbers", in a move that certainly helped precipitate this bid.) On the other hand, Microsoft will also greatly benefit in making its SharePoint platform far more attractive, and can likely extend the search capability of “its FAST” across all Microsoft applications, rendering a virtual universal search platform.

While all search technology providers are affected by this probable acquisition, Google is perhaps the most impacted. Google has been making strategic and powerful plays into the enterprise search space, leveraging their popularity on the Internet into an enterprise search tool.  Market popularity and familiarity have greatly helped in this cause.  If anyone can claim equal name recognition in the general market, it is Microsoft, albeit not with search – until today.

This of course changes the value proposition of ECM providers that have positioned themselves as SharePoint-enhancing, including OpenText and Documentum.  Their window of differentiation continues to close.  I cannot help but wonder at which point Microsoft will find itself back in antitrust court.

Should AIIM Market Intelligence scrap its plans for a Market IQ report on Findability in Q2 2008?  Is the subject moot?  Perhaps in the long term, but short term, this should make for a most exciting competitive market space.

January 02, 2008

Searching Into 2008

At the close of 2007 I had the privilege of being a co-presenter on a Google webinar focused on Universal Search.  I have posted the slides to Slideshare (Slideshare Link) and also included them below:

In the presentation I reference a market report conducted by my former company, Delphi Group, in 2006.  We found that the majority, 63% of business users, utilize 2-3 search tools in a typical week.  Another 29% used more than 3.  The reality of Enterprise Search is that it does not exist as a single entity or functionality.  For most business users, search is a multi-tool experience, for a variety of reasons.

During the Google webinar, we polled the audience of approximately 150 people and asked how many search tools they currently used.  As the chart below indicates, the results showed that the situation is only getting worse.  That is to say, 40% of the audience that day indicated that they use more than 3 search tools daily. 

[Chart: Googletools_2 (poll results on the number of search tools used)]

[A full 20% claim to use only one search tool (over zealous Google customers?).  Quite surprisingly, 7% claim they do not use any search tools.  Who are these people - gifted with divine wisdom?]

While search functionality may be improving, the positioning of search as an enterprise platform or competency is still not a reality. Indeed, we are moving further away from that goal. Users are expected to use any number of search tools to find relevant content in multiple repositories and applications.

While it can be argued that the criticality of search may warrant multiple search engines in-house, each tailored to a particular set of content and/or user need, one should hope that a common interface that could cut across all content and dynamically support all types of queries will be forthcoming.  Without such an approach, significant time and frustration are spent thrashing back and forth between collections, search tools and interfaces.  (The Delphi study included in my slides has statistics on this phenomenon as well.)

During the Google webinar, we also polled the audience regarding the percentage of enterprise content  that is searchable.   As the chart below indicates,  47% claim that 50% or more of the enterprise content is discoverable (only 12% indicated that >75% of content is discoverable.)

[Chart: Googledata (poll results on the percentage of enterprise content that is searchable)]

Clearly, we still have a ways to go before enterprise search can be considered truly enterprise-wide:  encompassing all enterprise content, utilizing a single interface with consistent functionality and results across all content.

These issues, and more are sure to emerge and be investigated more deeply in our upcoming Market IQ, slated for publication in Q2 2008.  Stay tuned.

November 28, 2007

aiimALERT: Search - It Ain't Over Until the Fat Lady Sings

IBM and Yahoo announced the third release (version 8.4.2) of OmniFind. (See release.)

Perhaps no ECM technology has ridden a more tumultuous roller coaster than search.  Search was in the limelight in the late 1980s, went into near obscurity via integration in the mid-1990s, and is now hotter than ever.  With the advent of the internet, the power of effective search against vast content collections became obvious for business users, both inside and outside the firewall.

Of late, attention has predominantly been on companies such as Google, FAST and Endeca.  These companies and their respective products do indeed represent new offerings for enterprise search, but with this announcement by IBM, we are reminded that search is still a focus for solution providers such as IBM (among others, Xerox and Autonomy).  Indeed, IBM must be credited with introducing text-based search to the market with its Stairs product more than a quarter of a century ago.  Much has been done to enhance Search from IBM since those early days.  This grandad - or "fat lady" - of the search market continues to be a player in the game with enhancements being made not only to their Yahoo internet edition, but to the OmniFind Enterprise Search tool as well.  Search ain't over yet folks, and this is likely not this fat lady's final aria.

November 15, 2007

Enterprise 2.0 and Google Docs - Oh the Irony

We are in the throes of developing our next Market IQ, and AIIM training course on Enterprise 2.0.  It is interesting that, as a result, we, the AIIM Market Intelligence Group, find ourselves using many Enterprise 2.0 collaborative tools as we undertake this collaborative authoring and research project.  Too many people (in our opinion, we'll see what the research shows) nearly exclusively target wikis and blogs as the tools of Enterprise 2.0.  While we are indeed using such tools, we are also using simple straightforward shared document authoring tools, namely Google Docs.  Google is to be commended for providing this toolset as freeware.  It's an excellent example of an Enterprise 2.0 tool (collaborative, web based, easy to deploy, low cost of entry).

It is ironic though, that this environment does not provide a tangential, often overlooked, technology, namely search.  While some may argue that search is not an Enterprise 2.0 technology, I would argue that it is clearly a related and valuable tool to "integrate" into the Enterprise 2.0 environment.  Too many wikis and blogs that I encounter provide little to no search capability.  This is a very real shortcoming for obvious reasons.  But in this case, the lack of such functionality is more than frustrating, and an oversight, it is downright ironic given that the provider of the platform is Google. 

Last week, I spoke at an AIIM Webinar sponsored by Google (access recorded webinar).  The webinar focused on universal search.  Google is positioning their enterprise search tool as just that, a single search platform that can cut across, integrate into, virtually any and all content repositories, providing a single point of search.  Such functionality promises to end the search silos, causes of frustration of so many knowledge workers who find themselves the users of multiple search tools, which was a major point I made in my presentation during the webinar. Invoking the browser search within GoogleDocs (in my case Firefox), provides me with search, but again as a silo, not via an enterprise search experience, not via a "standardized search box", not via Google.

This is likely just a symptom of a nascent technology genre, an issue we plan to delve into in the upcoming Market IQ.