During my recent, reasonably long (and fully unplugged!) vacation, I was able to read David Weinberger’s latest work, Everything Is Miscellaneous: The Power of the New Digital Disorder. I enjoyed this book every bit as much as I enjoyed reading Small Pieces Loosely Joined.
David begins by asking how our ideas, organizations, and knowledge itself might change if we could arrange such concepts without the “silent limitations of the physical.” He immediately suggests that in such a world, being free (as in freedom) is not the desired result; being miscellaneous is.
In the process of making music miscellaneous, iTunes et al. revealed that the natural unit of music is the track, not the album. Translating this to the world of ECM, what is the natural unit of content (or if you prefer, information)? Is it the document, or is it something else? Does the answer depend on whether you sort it all out on the way in or sort it all out on the way out?
One of the early solutions from Documentum–long before its acquisition by EMC–provided the ability to take a collection of PowerPoint presentations and present the end user with a filtered collection of individual slides to promote visibility of already authored content and therefore increase the likelihood of content reuse via assembly. (Fast forward to the present and an offering like SlideShare.) Since then, XML has taken center stage along with macro-formats like ODF and Open XML, increasing the potential for chunking, decomposition, remixing, etc.
David defines three orders of order as follows:
- First: organize things themselves
- Second: separate information about first-order objects from the objects themselves (e.g. a catalog)
- Third: digitize content and metadata, then be extravagant about placement/categorization/fulfillment
ECM operates largely in a third order world where traditional terms such as document, content and information are exploding–requiring long-held views to be rethought (e.g. are we talking about content or metadata? What is the difference between the two? What about indexing, full-text or otherwise?). Just when you near clarity the landscape shifts again (e.g. a binary/closed document format becomes a more open envelope of embedded documents–some content, some behavior, some presentation-related, etc.; a pivot occurs that swaps foreground concerns with background concerns–authors and publications, content and metadata, taxonomies and folksonomies, indices and relationships, etc.).
Is it fair to continue talking about structured information and unstructured information in the way largely batted around today (e.g. structured information fits neatly into rows and columns, typically within a database)? Or is this characterization increasingly less black and white (e.g. databases handling BLOBs, document assembly at runtime via a managed (structured) process, etc.)?
What other premises are accepted that can/should be re-thought (e.g. there is a set of appropriate criteria for finding–one right way to find)?
Returning to iTunes, browsing Apple’s online music store requires a particular approach (i.e. genre, artist, album–in that order) to find tunes of interest to buy. However, once you return to the iTunes music player software, there is more freedom to order and sort your collection–from Apple’s store and/or elsewhere. Better yet, you can create playlists (i.e. pure metadata collections) to share with family and friends–and this is so popular that practically every digital music player supports the creation, import and export of playlists.
“Now that information is being commoditized, it has more value if it’s set free into the miscellaneous.” -David Weinberger
Arguably there are a number of content-related playlists already (e.g. bookmarks/favorites and sites like Delicious, feeds based in Atom or RSS, subscription outlines in OPML). Does your content management system satisfy your playlist needs? How do you share content-related playlists at work or outside of work (e.g. like you would share an .m3u file with a friend)?
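To make the "content playlist" idea a little more concrete, here is a minimal sketch in Python. It treats a playlist as pure metadata (a named, ordered list of titles and locations) and writes it out in the extended M3U style that music players use. The class names, the method names, and the example.com URLs are all hypothetical, and pointing an .m3u file at documents rather than audio is only an illustration, not something a particular content management system is known to support.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Entry:
    """One playlist item: metadata only, the content itself lives elsewhere."""
    title: str
    location: str  # URL or path to the content (hypothetical values below)


@dataclass
class ContentPlaylist:
    """A named, ordered collection of references to content."""
    name: str
    entries: List[Entry] = field(default_factory=list)

    def add(self, title: str, location: str) -> None:
        self.entries.append(Entry(title, location))

    def to_m3u(self) -> str:
        """Serialize in extended M3U style: a header line, then an
        #EXTINF line (duration, title) followed by the location."""
        lines = ["#EXTM3U"]
        for entry in self.entries:
            # -1 is the usual convention for "duration unknown"
            lines.append(f"#EXTINF:-1,{entry.title}")
            lines.append(entry.location)
        return "\n".join(lines) + "\n"


# Hypothetical usage: the URLs are placeholders, not real documents.
playlist = ContentPlaylist("ecm-reading")
playlist.add("Quarterly report", "https://example.com/docs/q3-report.pdf")
playlist.add("Design notes", "https://example.com/docs/design-notes.html")
m3u_text = playlist.to_m3u()
```

The point of the sketch is that nothing in the playlist holds content; sharing it is just sharing a small text file of references.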
I plan to post more about Everything Is Miscellaneous; there is certainly much more to this book.
In the meantime, my feed reader is enriched thanks to David’s references to the following thought leaders: Danah Boyd, Peter Morville, and Thomas Vander Wal–plus David Weinberger, too. Of course, in keeping with this post, you’ll find my updated “playlist” with these inputs now, too.
4 responses so far ↓
1 alexandra // Sep 16, 2007 at 3:05 pm
Interesting thoughts. I have always thought it is important to reflect on where and how (in what format) we actually store and handle information, since it affects what we can do with it later. Maybe the document is the equivalent of the album in your iTunes example. I have always wanted to break down a text document into smaller discrete objects and look upon the document as merely a collection of those objects. Some examples of this thinking can be seen in XML-based publishing concepts for brochures, where both images and content can differ depending on geographic areas and target groups.
To me it would be interesting to bring that concept to a wider market, for example people doing analysis work, where paragraphs or even people, locations and things can be combined, or at least described/referenced as objects, while still looking like text in a Word document.
I think the development of semantic technologies and OWL/RDF provides some promise in this area, but there is still a lack of easy-to-use commercial applications where the content INSIDE the documents can be referenced and tagged. I would love to see Documentum provide these technologies in the repository. Then we only need a client–in essence a new kind of Office application–to do the actual tagging/combining/writing. Maybe the easiest way is to leverage OpenOffice for a development project like this.
2 Craig Randall // Sep 16, 2007 at 3:18 pm
One of my follow-up posts is planned to be along the lines of “fuzzy categorization.” For example, can Semantic Web efforts be successful (in the way the Internet itself has defined success) as long as RDF triples are foundational? Are taxonomies good for anything more than jump-starting enterprise tagging? For example, is “taxonomy” the same as “initial folksonomy”? This all goes to participation versus control and contribution versus credentials. Cheers…
3 alexandra // Sep 16, 2007 at 4:30 pm
Fuzzy categorization…looking forward to that. I believe there are several aspects of the problem.
First there is a need to find an easy way to manage references of how pieces of content are used together to create new content. Something like the way iLife works on Mac OS X. You are writing a text document in Pages, bring up the media browser, drag-&-drop a photo and the use of it in that particular document is stored globally. That way it is easy to see in what documents (and in essence context) this image is being used.
The second part is the need to be able to tag all content with metadata in different forms. I don’t actually think there is a contradiction between managed taxonomies and folksonomies. The most successful way is probably to have them interact. Just as only a part of our content can be manually tagged (for the rest we have to settle for some automatic measure), a managed taxonomy can only be so big. A folksonomy, however, does not have that limitation. The managed taxonomy can, in turn, be regularly updated based on developments in the folksonomy. Possibly content can have taxonomy tags, public folksonomy tags and even private folksonomy tags that users don’t want to share (like Siderean does, I believe).
Third is the need to sometimes find a way to create a discrete and unambiguous classification of words in a text. The reason is that it allows for a fixed context for that particular word. It also makes automatic reasoning with advanced queries possible. If I had a text saying “There is a Volvo, a Saab, a Ford and a Toyota in the parking lot” I can ask how many cars there are in the parking lot without the word “car” being mentioned. It also means that I can ask how many Swedish cars there are, since the ontology that is being used contains that information.
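As a rough sketch of that parking-lot query in Python: the tiny ontology dictionary and the function name below are made up for illustration and stand in for what a real OWL/RDF store and reasoner would do, so treat this as a toy under those assumptions rather than how any semantic product works.

```python
import re

# Hypothetical toy ontology: each known word maps to the facts we hold about it.
# A real system would keep this in an OWL/RDF store; a dict is an assumption here.
ONTOLOGY = {
    "Volvo": {"is_a": "car", "country": "Sweden"},
    "Saab": {"is_a": "car", "country": "Sweden"},
    "Ford": {"is_a": "car", "country": "USA"},
    "Toyota": {"is_a": "car", "country": "Japan"},
}


def find_instances(text, is_a, **constraints):
    """Return the words in `text` that the ontology classifies as `is_a`
    and that match every extra constraint (e.g. country="Sweden")."""
    hits = []
    for word in re.findall(r"[A-Za-z]+", text):
        facts = ONTOLOGY.get(word)
        if facts is None or facts["is_a"] != is_a:
            continue
        if all(facts.get(key) == value for key, value in constraints.items()):
            hits.append(word)
    return hits


text = "There is a Volvo, a Saab, a Ford and a Toyota in the parking lot"
cars = find_instances(text, "car")                      # the word "car" never appears in the text
swedish = find_instances(text, "car", country="Sweden")
print(len(cars), len(swedish))  # prints "4 2"
```

The text never says "car" or "Swedish"; both answers come from the classification attached to each word.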
Success with the general audience is of course a good way to make things happen fast, but do we need to wait for that when there is already a need for these technologies? Possibly the people who need it don’t yet know they need it, but they definitely do know that they have increasing trouble handling vast amounts of information effectively.
Also, “semantic object-referenced word processing” is probably not needed all the time, but it could be an interesting alternative to the custom database applications that store objects and their relations, like those used in most law enforcement agencies.
4 Document collaboration - questions // Feb 12, 2008 at 5:51 pm
[...] RSS ← Everything Is Miscellaneous [...]