Monthly Archives: May 2006

More Knowledge Network

As you may recall from my previous post, Knowledge Network (KN) is about unlocking tacit knowledge in the enterprise (versus explicit knowledge). Glen Anderson, Group Product Manager, presented an “under the hood” view of KN, following on the heels of John Hand’s overview presentation yesterday (ref).

The first topic discussed was client profile creation, which is accomplished in four phases: select information (i.e. data sources to analyze such as the primary data source of Outlook folders), run analysis, review profile and publish profile. Client profile creation follows the Notification-Control-Consent (NCC) privacy model I relayed yesterday, which is also shown graphically here:

KN Notification-Control-Consent Privacy Model

The initial analysis phase (#2 within the overall client profile creation process) also consists of four phases as follows:

1. Sync

  • Read each email. Essentially only the first paragraph of the email body is parsed by KN; attachments are not scanned.
  • Capture key “interaction data” in a local Access database.
  • Read in contacts from Outlook and IM clients.
  • Sync colleagues from SharePoint Portal Server 2007 profile.

2. Contact resolution

  • Look up contacts against the Global Address List (GAL) (MAPI).
  • Internal or external?
  • Is the contact a distribution list? If so, discard.
  • Capture key properties.

3. Update

  • Aggregate counts.
  • Check thresholds.
  • Calculate strength.

4. Recommendation

  • “Exclusion lists”
  • Special rules
  • Limits applied
  • Organization name mapping for external contacts
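
The “Update” phase above (aggregate counts, check thresholds, calculate strength) might be sketched roughly as follows. This is only an illustration: the talk did not describe the actual formula, so the `Interaction` type, the `min_count` threshold, and the idea of strength as a normalized message count are all my assumptions.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Interaction:
    """One email exchange with a contact (hypothetical record shape)."""
    contact: str
    is_distribution_list: bool = False


def contact_strengths(interactions, min_count=2):
    """Return {contact: strength in (0, 1]} for contacts above the threshold.

    Assumed rules, not KN's real ones:
      * distribution lists are discarded (as in the contact resolution phase),
      * contacts seen fewer than `min_count` times are dropped,
      * strength is the contact's count divided by the largest count.
    """
    # Aggregate counts per contact, ignoring distribution lists.
    counts = Counter(
        i.contact.lower() for i in interactions if not i.is_distribution_list
    )
    # Check thresholds.
    kept = {contact: n for contact, n in counts.items() if n >= min_count}
    if not kept:
        return {}
    # Calculate strength relative to the most frequent contact.
    top = max(kept.values())
    return {contact: n / top for contact, n in kept.items()}


if __name__ == "__main__":
    sample = (
        [Interaction("a@example.com")] * 4
        + [Interaction("b@example.com")] * 2
        + [Interaction("c@example.com")]
        + [Interaction("all-staff@example.com", True)] * 10
    )
    print(contact_strengths(sample))
```

In this sketch the distribution list never contributes, however noisy it is, and the single message from `c@example.com` falls below the threshold.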

KN analysis is captured locally by the KNClient.log and MDB files located within %UserProfile%\Local Settings\Application Data\Microsoft\Knowledge Network. There are two Access databases on the local client machine: one has the raw data (i.e. parts of emails), and the other reflects your actual profile choices.

After talking about client profile creation, Glen went on to address some of the top questions customers have raised about KN.

Why client-side mining?

  • Privacy – Nothing leaves user’s machine until they “publish.”
  • Access to information – PSTs, future data sources (I’d certainly like to understand what some of these potential data sources might become).
  • Distribute the processing.

Why not mine sources other than email?

  • Email is by far the richest and most pervasive source today.
  • Calculating strength across different data sources adds complexity (I’d like to understand the nature of this complexity in more detail).

How long does the analysis process take?

  • Depends on a number of factors (e.g. on the amount of email and unique contacts; on disk performance, RAM, and CPU; on user activity, since the analysis process runs at low priority).
  • Microsoft has measured anywhere from 5 minutes (3K emails) to 12 hours (120K emails); further optimization prior to release is planned.
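
As a back-of-the-envelope check on those two measurements, the implied throughput can be worked out as below. The email counts and durations are the rounded figures quoted in the talk; the helper name is mine, and real throughput will depend on the factors listed above.

```python
def emails_per_second(email_count, duration_seconds):
    """Average analysis throughput for one measured run."""
    return email_count / duration_seconds


# Rounded figures from the presentation.
small_mailbox = emails_per_second(3_000, 5 * 60)        # 5 minutes
large_mailbox = emails_per_second(120_000, 12 * 3600)   # 12 hours

# How much slower, per email, the large run was than the small one.
slowdown = small_mailbox / large_mailbox

if __name__ == "__main__":
    print(f"small: {small_mailbox:.2f} emails/s")
    print(f"large: {large_mailbox:.2f} emails/s")
    print(f"per-email slowdown: {slowdown:.1f}x")
```

The two data points suggest that analysis time does not scale linearly with mailbox size, although they may well come from different machines and usage patterns.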

The rest of Glen’s presentation covered SharePoint server integration (e.g. mention of a KN Profile Management Web Service for publishing client-side, Access-based data to the server-side KN store), privacy and anonymous brokering (i.e. a process of connecting seekers and targets anonymously in a manner reminiscent of referrals on LinkedIn), deployment and administration, and extensibility.

On the last subject of extensibility, there is a managed API on the client-side to access the KN database. On the server-side, there are at least two web services: one that exposes a full-fledged query language for expertise/social network people search and another to retrieve (read) and augment (write) profile data (e.g. bootstrap the profile or add custom keywords and contacts). It will be interesting to examine the WSDL of these web services (e.g. via the KN SDK) to understand how to potentially introduce other data sources into the KN system beyond email (e.g. content subscriptions via feeds, authoring trend data from content repositories, discussion thread contributions in collaboration stores, etc.).

Today’s KN session finished with a fair amount of Q&A, indicating clear product interest.

For more KN information, stay tuned here. I also recommend visiting the KN team blog.

Microsoft Knowledge Network 101

Today’s first breakout session slot at the Microsoft SharePoint Conference 2006 featured an engaging presentation (i.e. product launch) on “Knowledge Network for Microsoft Office SharePoint Server 2007.” The talk was engaging both because the material, Knowledge Network (KN), is all new and because the presenter, John Hand, Senior Marketing Manager, Information Worker Greenhouse, Microsoft, was articulate and humorous. (John, I think your comments about Bush just increased your personal KN. :-))

“Knowledge Network for Microsoft Office SharePoint Server 2007” (hereafter referred to simply as KN) is a version 1.0 add-on product for SharePoint 2007 (i.e. it’s not a standalone product). Today was its first public showing, and you could definitely sense John’s excitement and pride in the process. Prior to today there were some three dozen executive-level briefings under NDA as well as an internal (?) beta program, codenamed “Knowledge Interchange” or simply “KI,” that reached 814 users at its peak.

The product represents the first commercial offering from a relatively new, tight organization known as Information Worker Greenhouse (IWG). KN was produced by an IWG team of only 12 individuals in total, including dev, test and (per John) “overhead” (i.e. all inclusive).

KN is all about connecting people to people. KN is software for enterprise social networking that helps users collaborate more effectively by automating the discovery and sharing of undocumented knowledge and relationships.

  • Who knows whom? (Connectors)
  • Who knows what? (Expertise Location)

Microsoft is careful to differentiate KN from manual systems typically deployed today for expertise location. Also, Microsoft points out that it’s expertise, not expert, location. That is, KN is not about queuing up to talk to the guru; it’s about organically identifying capable resources to make better decisions more quickly.

Microsoft is creating KN based on three core beliefs:

  • Most information is undocumented
  • It’s difficult to connect to the right person
  • “Weak ties” deliver significant value

Most knowledge is undocumented. According to social networking research, employees are more likely to turn to colleagues than to systems for information. This knowledge isn’t stored in documents or in databases. It’s stored in people’s heads (an 80-20 split: 80% individual knowledge, 20% documented knowledge)…and the baby boom is about to walk out the door!

It’s difficult to connect to the right person. Finding the right person often involves a referral by an intermediary, and KN acknowledges this phenomenon transparently.

“Weak ties” deliver significant value. This belief draws upon the original 1973 work of Mark Granovetter, “The Strength of Weak Ties”: people in our inner circle know basically the same people and the same things that we know. In other words, the first degree is homogeneous and represents colleagues. The second degree (periphery) adds value and represents colleagues’ colleagues. The third degree (edge) represents everyone else.

The Knowledge Network solution features a client and a server that work together to serve “seekers” as follows:

  1. (Client) Analyze email to create profile (keywords, contacts, external contacts)
  2. (Client) Publish profile to server (incremental updates)
  3. (Server) Aggregate profiles (expertise information, social network)
  4. (Seeker) Search for people (Who knows whom? Who knows what?)

Results are presented to the seeker in SharePoint, ranked by social distance to seekers and relevance in profiles.
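
To make the ranking idea concrete, here is a minimal sketch of ordering search results by social distance and then by relevance. How KN actually combines the two was not explained, so the breadth-first distance calculation, the tie-breaking rule, and the `graph` and `relevance` inputs are all assumptions on my part.

```python
from collections import deque


def social_distances(graph, seeker):
    """Breadth-first search: degrees of separation from the seeker."""
    dist = {seeker: 0}
    queue = deque([seeker])
    while queue:
        person = queue.popleft()
        for colleague in graph.get(person, ()):
            if colleague not in dist:
                dist[colleague] = dist[person] + 1
                queue.append(colleague)
    return dist


def rank_results(graph, seeker, relevance):
    """Order matching people by distance first, then by higher relevance.

    `relevance` maps each matching person to a hypothetical profile score.
    People with no path to the seeker are left out of this sketch.
    """
    dist = social_distances(graph, seeker)
    candidates = [p for p in relevance if p != seeker and p in dist]
    return sorted(candidates, key=lambda p: (dist[p], -relevance[p]))


if __name__ == "__main__":
    graph = {
        "ann": ["bob", "cy"],
        "bob": ["ann", "dee"],
        "cy": ["ann"],
        "dee": ["bob"],
    }
    relevance = {"bob": 0.2, "cy": 0.9, "dee": 0.8, "zed": 1.0}
    print(rank_results(graph, "ann", relevance))
```

Here a direct colleague with a weak match still outranks a colleague’s colleague with a strong one; a real system would more likely blend the two signals into a single score.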

KN is a “bottoms-up” relationship tracking system. At the bottom are relationship data sources, or digital clues, such as email, team sites, IM contacts, contact lists, portraits, my sites, Active Directory relationships (i.e. AD is required), and distribution lists. Relationship data sources are inputs for the KN Correlation Store, which is surfaced with the SharePoint browser experience. The Correlation Store is also accessible to line-of-business applications via APIs and an SDK.

Microsoft understands that there may be privacy concerns involving KN, and believes that addressing these concerns involves striking the right balance among utility, simplicity and privacy.

  • Utility: how useful will this software be to me?
  • Simplicity: how easy will this software be to install, upload, maintain and use?
  • Privacy: how much personal information will this software reveal and how much control do I have?

Therefore, the following simple privacy model (NCC) is employed:

Notification

  • Communicate steps of the profile creation and publication process
  • Customers can expose privacy policy in the client profile wizard

Control

  • User can choose which items to include/exclude
  • User can choose from 5 levels of privacy to apply to each profile item to control who is allowed to view that information on the server (i.e. everyone (outermost ring of visibility), my colleagues, my workgroup, my manager, and me (center of the visibility universe))
  • Admins can configure the default operation of the client, including opt-in/opt-out and the default privacy visibilities for profile items
  • Admins can determine which aspects of the product functionality to leverage including external contacts, anonymous results, and DL keywords

Consent

  • KN sends no data to the server before the user has approved it
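
The five visibility levels can be pictured as nested rings. The sketch below assumes strict nesting (anyone in an inner ring can also see what the outer rings can see), which is my reading of the “rings” description rather than confirmed product behavior; the `Ring` enum and `visible_items` helper are hypothetical names.

```python
from enum import IntEnum


class Ring(IntEnum):
    """Visibility rings, from outermost (0) to the center (4)."""
    EVERYONE = 0
    MY_COLLEAGUES = 1
    MY_WORKGROUP = 2
    MY_MANAGER = 3
    ME = 4


def visible_items(profile, viewer_ring):
    """Return the profile items a viewer in `viewer_ring` may see.

    `profile` maps each item to the ring the owner chose for it.
    An item is visible when the viewer sits in that ring or closer in.
    """
    return [item for item, ring in profile.items() if viewer_ring >= ring]


if __name__ == "__main__":
    profile = {
        "sharepoint": Ring.EVERYONE,
        "project x": Ring.MY_WORKGROUP,
        "draft keyword": Ring.ME,
    }
    print(visible_items(profile, Ring.MY_COLLEAGUES))
    print(visible_items(profile, Ring.MY_MANAGER))
```

Whether a manager really sits “inside” the workgroup ring is exactly the kind of detail I would want to confirm against the product.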

KN makes Microsoft’s “people ready” campaign tangible. For example, using KN you can render a spider chart of interactions and then contrast that to actual organizational structure to see how optimal your organization may or may not be.

I’m looking forward to the 300-level (under the covers) presentation tomorrow.

Update 5/17/2006 at 4:30pm: Mary Jo Foley posted two articles on IWG in the past (ref. [1] and [2]). It’s interesting to compare their speculation with today’s reality, which she also speculates on (ref).

Rocky is right; software is too hard

While the Microsoft Architect Advisory Board (MAAB) was still active, I had the opportunity to work with Rocky Lhotka in the Smart Client Architectures working group. I’ve subscribed to his blog for a while now and flagged both of his posts arguing that software is too hard ([1] and [2]) to read on my flight up to next week’s SharePoint conference. Today’s mail included a copy of Visual Studio magazine featuring a guest column by Rocky adapted from his first blog post. It was all the nudge I needed to stop and read the article “Software Is Too Darn Hard.”

This is timely advice for all of us in the software industry and certainly for me personally as I focus on services and service orientation (e.g. value yielded, not architectural polish).

Useful responsiveness buckets

The following bins of responsiveness (i.e. response times) based on HCI research at Carnegie Mellon University resonate with me:

  • Crisp: < 150 ms;
  • Noticeable to annoying: 150 ms to one second;
  • Annoying: one to two seconds;
  • Unacceptable: two to five seconds; and
  • Unusable: > five seconds.

The context in which these bins were described appears in the March 2006 issue of IEEE Computer: “Quantifying Interactive User Experience on Thin Clients.”
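
For anyone who wants to apply these bins to measured response times, a minimal sketch follows. The bins as listed do not say which side the exact boundaries (one, two and five seconds) fall on, so assigning each boundary value to the lower bin is my assumption, as is the function name.

```python
def responsiveness_bucket(response_ms):
    """Classify a response time in milliseconds into one of the five bins."""
    if response_ms < 0:
        raise ValueError("response time cannot be negative")
    if response_ms < 150:
        return "crisp"
    if response_ms <= 1000:
        return "noticeable to annoying"
    if response_ms <= 2000:
        return "annoying"
    if response_ms <= 5000:
        return "unacceptable"
    return "unusable"


if __name__ == "__main__":
    for sample_ms in (80, 400, 1500, 3000, 8000):
        print(sample_ms, "ms:", responsiveness_bucket(sample_ms))
```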