Michael Arrington
TechCrunch.com
Today marks another milestone for San Francisco-based contextual search engine Powerset. They’ve launched a showcase for their user search experience - effectively the search engine minus the web crawl. For now, Powerset queries only Wikipedia and augments results with data from Freebase. The product launch comes just a day after reports that the company is being shopped to potential buyers by investment bank Allen & Co.
I have been able to test Powerset via their labs site for the last few weeks. I wrote about it last month, and the version that just launched is very similar.
There is no way to look at Powerset today and determine if it can be as disruptive to search as Google was when it launched almost a decade ago. That’s because it only queries Wikipedia, and so there is little need for proper ranking algorithms to sort the good from the bad results.
But what users can see is how effective a way it is to gather information quickly. For someone doing research, Powerset effectively removes a number of steps towards getting to the final information. It is particularly effective when the information needed is on many different web pages.
For example, a query on Powerset of “when did earthquakes hit tokyo” yields stunning results. Try this query at Google or even Wikipedia to compare - instead of just picking out keywords that are in your query and on a web page, Powerset is actually making some sense of the content included in the Wikipedia pages.
The way that Powerset returns queries means that answers are often found in the result snips, as above. They are also structuring a lot of the Wikipedia (and already structured Freebase) data and inserting it into results. So a search for “Bill Clinton” shows results, but also shows Freebase structured data along with additional query refinements to get to more information. The important thing below isn’t the structured data in the results, it’s the fact that you can click on the action words and drill down into very specific queries (to find, for example, what bills he signed, or which Supreme Court justices he nominated, or who he slept with).
Powerset is indexing web pages much differently than normal search engines, which generally just record content to match against keyword queries. Instead, Powerset is trying to understand the content on the page so that it can be matched meaningfully to queries later, even queries that don’t use matching words.
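To make the difference concrete, here is a minimal sketch in Python. It is not Powerset's actual method, which the company has not published in this form. The synonym table, the helper names, and the two sample documents are all assumptions made for illustration. A plain keyword index misses a page that says "tremor shook" when the query says "earthquake hit", while an index built on normalized concepts finds it.

```python
from collections import defaultdict

# Hypothetical synonym table: a stand-in for real linguistic analysis.
CONCEPTS = {
    "earthquake": "quake", "earthquakes": "quake", "tremor": "quake",
    "hit": "strike", "struck": "strike", "shook": "strike",
}

def tokens(text):
    """Keyword analysis: lowercase words with punctuation stripped."""
    return [w.strip(".,?!").lower() for w in text.split()]

def concepts(text):
    """Concept analysis: map each word to a shared concept label if known."""
    return {CONCEPTS.get(t, t) for t in tokens(text)}

def build_index(docs, analyze):
    """Build an inverted index: analyzed term -> set of document ids."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in analyze(text):
            index[term].add(doc_id)
    return index

def search(index, query, analyze):
    """Return ids of documents that contain every analyzed query term."""
    terms = list(analyze(query))
    if not terms:
        return set()
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return result

# Illustrative documents, not real Powerset or Wikipedia data.
DOCS = {
    1: "A tremor shook Tokyo in 1923.",
    2: "Tokyo hosts many technology companies.",
}

keyword_index = build_index(DOCS, tokens)
concept_index = build_index(DOCS, concepts)

keyword_hits = search(keyword_index, "earthquake hit Tokyo", tokens)
concept_hits = search(concept_index, "earthquake hit Tokyo", concepts)
print(keyword_hits, concept_hits)
```

The keyword search comes back empty because neither "earthquake" nor "hit" appears in any document, while the concept search returns document 1. A real system would need far more than a lookup table, which is part of why this style of indexing costs more per page.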
Indexing the web is expensive, though, and Powerset’s way of doing it requires even more time and computing power dedicated to a web page. That’s why they say they aren’t indexing the entire web yet - the company has raised just $12.5 million (plus another $8 million or so in bridge loans from investors). To index the web will require a new round of financing (see the first paragraph above about their sale/financing efforts).
Powerset has taken a lot of criticism for their goal of trying to redefine how people search the web (including from us). But their lofty goals are what make Silicon Valley so great - succeed or fail, Powerset is trying to do something pretty spectacular.
The company has also created a demo overview video - see below.
Source: www.washingtonpost.com
Cognition.com appears to be ahead of Powerset. Here is why:
Cognition’s Semantic Natural Language Processing (NLP) technologies add word and phrase meaning and understanding to computer applications, providing a technology and/or end-user with actionable content based upon semantic knowledge. This understanding results in simultaneously much higher precision and recall of salient data within the universe of possible results. Cognition’s Semantic NLP™ makes technologies and applications more human-like in their understanding of language, thereby resulting in more robust applications, greater user satisfaction and new capabilities available for exploitation. On the Web in particular, powering applications with Cognition’s semantic understanding technology drives these applications ever closer to Web 3.0 (the semantic Web).
Cognition - Giving technologies new meaning.™
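For readers unfamiliar with the two measures named here, the following is a minimal sketch of how precision and recall are usually computed for a single query. The function name and the sample result sets are assumptions for illustration and are not figures from Cognition.

```python
def precision_recall(retrieved, relevant):
    """Compute (precision, recall) for one query.

    precision = relevant results returned / all results returned
    recall    = relevant results returned / all relevant results
    """
    retrieved, relevant = set(retrieved), set(relevant)
    hits = retrieved & relevant
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall

# Hypothetical example: 4 results returned, 3 documents truly relevant.
p, r = precision_recall(retrieved=["a", "b", "c", "d"], relevant=["a", "b", "e"])
print(p, r)
```

The two measures usually trade off against each other: returning more results tends to raise recall and lower precision, which is why a claim to improve both at once is a strong one.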
Introduction
Cognition Technologies, Inc. (“Cognition”) is a next generation Semantic Natural Language Processing (NLP) company, based in Culver City, CA.
What is Semantic NLP?
Semantics is the sub-field of linguistics that is devoted to the study of meaning, as expressed by words, phrases, sentences, and even larger units of speech or text.
Natural Language Processing (NLP) is a sub-field of artificial intelligence and computational linguistics. It studies the problems of automated generation and understanding of natural human languages by computers.
Cognition’s Semantic NLP™ is technology that “understands” word and phrase meanings within context in modern computer applications. Cognition’s mission is to make its clients’ technologies and applications more human-like in the understanding of language and more profitable.
Cognition’s Semantic NLP has been in development for over 23 years by Dr. Kathleen Dahlgren, Cognition’s co-founder and CTO, and a team of linguists and computer scientists. Cognition’s technology employs a mix of linguistics and mathematical algorithms which has, in effect, taught the computer the meanings of virtually all the words and frequent phrases within the common English language. Semantic Natural Language Processing is superior to common pattern matching that is found in most search engines and text-interaction tools because it focuses on the understanding of word and phrase meanings within context. No other commercially available natural language processing technology comes close to Cognition in its breadth and depth of understanding the English language.
Statistics
Cognition’s Semantic NLP technology contains one of the world’s largest computational dictionaries. It includes:
506,000 Word Stems (the base forms of a word)
536,000 Concepts
17,000 Ambiguous Words - the most frequently used words in the English language
191,000 Phrases
Over 4 million semantic contexts
76,000 synonym sets
Cognition’s place in the world related to the “Semantic Web” (Web 3.0) and Google
Cognition employs semantic technology to delve into the meaning of words and phrases, and unlike others who are trying to make the Semantic Web a reality through hand-tagging, such as Web Search, Cognition applies its Semantic NLP to other technologies to give these products and services a differentiation and competitive edge.
“We look at what we’re doing as a significant component to the Semantic Web,” said Scott Jarus, Cognition’s CEO. “Our focus on semantically enhancing other technologies means we’re not competing with Google, Yahoo! or other consumer Search engines. Indexing the entire World Wide Web ourselves is not currently on our business roadmap. However, we might become a semantic component of someone else’s application which may index deep content on the Web similar to the examples you can see on our Website.”
Management
Scott Jarus
Chief Executive Officer
Scott joined Cognition Technologies in 2006 as an investor and then as its CEO. Mr. Jarus has more than 25 years of management experience in the telecommunications and Internet industries, beginning with a company that built one of the world’s first public packet-data switching networks. Prior to joining Cognition, Scott was President and chief executive of j2 Global Communications, Inc. (NASDAQ: JCOM), a profitable billion dollar market cap company whose signature product, eFax®, served more than 9.5 million customers with a local presence in more than 1,500 cities in 25 countries on 5 continents. Preceding j2 Global, Mr. Jarus was President and Chief Operating Officer for OnSite Access, the premier building-centric Integrated Communications Provider (voice, data, Internet and enhanced services) serving businesses in 22 markets throughout North America. In addition, he served in various senior management positions at RCN Telecom, Multimedia Medical Systems (which he co-founded) and Metromedia Communications.
Mr. Jarus serves on the Board of Directors of FreeConference.com and Ironclad Performance Wear [ICPW.OB]. In 2005, Mr. Jarus was named the National Ernst & Young Entrepreneur Of The Year for Media/Entertainment/Communications (and Los Angeles Entrepreneur Of The Year for Technology in 2004). He holds a Bachelor of Arts degree in Psychology and a Master of Business Administration degree from the University of Kansas.
Kathleen Dahlgren, PhD
CTO / Founder
Dr. Kathleen Dahlgren is the Founder and Chief Technology Officer of Cognition Technologies. She began her career as a professor of computational linguistics at Pitzer College of the Claremont Colleges and then worked for IBM at their Los Angeles Scientific Center, focusing on building a “natural language understanding system.” Dr. Dahlgren has a Ph.D. in Linguistics and a post-doctorate in Computer Science from the University of California, Los Angeles. She has published a number of scholarly articles on the subjects of linguistics and computer science, and is the author of Naive Semantics for Natural Language Understanding. She is the co-author of Cognition’s seminal patent (1998), and she received the Small Business Innovation Award from the U.S. Army in 1995. Currently, she is also an adjunct professor of Linguistics at the University of California, Los Angeles.