Serious Insights

Research and reviews from strategist, futurist and analyst Daniel W. Rasmus

The Problem With Document References and How Knowledge Management Fails Us

October 27, 2022 by Daniel W. Rasmus

The other day a colleague called and asked about the best way to find a concept within an enterprise document repository. 

After some discussion, it became clear that “an enterprise document repository” was a bit of a stretch. What she meant was finding the concept across the entire enterprise, regardless of where it was stored.

The target concept consisted of a set of management principles. In the current incarnation, the enterprise touted seven principles. After working with a number of consultants, it had honed its principles to five. The goal was to find all references to the seven principles and replace those references with the five.

I informed my colleague that there was no way to accomplish that task completely, even with the most sophisticated knowledge management system. I shared the following reasoning:

  • Structured documents do not structure around concepts; they structure around document organization, such as headings, tables, and other elements. This type of structure adds no value in a discovery task like this.
  • Search would likely find many, but not all, documents, for many reasons: inaccurate references (such as a reference to leadership principles rather than management principles), colloquialisms, partial references (such as “in management principle number 7” or “our principles”), and technical issues such as the precision and recall of the search algorithm. Also, because modern search engines typically rank by relevance, intranet pages with demonstrated use and inbound links may well appear, but documents that live on their own gain nothing from the social signals the search engine accounts for.
  • While most documents in formal repositories use tags, it is likely that the number of documents that reference the concept will be relatively small compared to the number of times those formal documents are referenced. Further, tags require their own management. The idea of “management principles” may not be an assigned tag, and if not, the tags will not contribute to the discoverability of the target documents.
  • Not all documents that reference the concept will be discoverable, such as downloaded copies, copies stored on removable media, copies e-mailed outside of the organization, or copies stored on servers not part of the enterprise index.
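The search limitation described above can be illustrated with a small sketch. Everything in it is an assumption made for illustration: the five-document corpus, the file names, and the list of variant patterns are hypothetical, and a real enterprise index would behave very differently. It only shows how exact-phrase search and a hand-maintained variant list each fall short of full recall.

```python
import re

# Hypothetical mini-corpus; every document refers to the principles in some way.
documents = {
    "handbook.txt": "Our seven management principles guide every decision.",
    "memo.txt": "As stated in management principle number 7, we value candor.",
    "slides.txt": "The leadership principles apply to all teams.",
    "email.txt": "Please keep our principles in mind during reviews.",
    "notes.txt": "The big seven still matter.",
}

def exact_search(phrase, docs):
    """Return names of documents containing the phrase verbatim (case-insensitive)."""
    return {name for name, text in docs.items() if phrase.lower() in text.lower()}

# Assumed variant patterns; in practice such a list is never complete.
VARIANTS = [
    r"management principles?",
    r"leadership principles?",
    r"our principles",
]

def variant_search(patterns, docs):
    """Return names of documents matching at least one variant pattern."""
    compiled = [re.compile(p, re.IGNORECASE) for p in patterns]
    return {name for name, text in docs.items()
            if any(c.search(text) for c in compiled)}

def recall(found, relevant):
    """Fraction of the relevant documents that the search retrieved."""
    return len(found & relevant) / len(relevant)

relevant = set(documents)  # by construction, all five are relevant
exact_hits = exact_search("management principles", documents)
variant_hits = variant_search(VARIANTS, documents)

print(sorted(exact_hits), recall(exact_hits, relevant))
print(sorted(variant_hits), recall(variant_hits, relevant))
```

In this toy corpus the exact phrase finds only the handbook, the variant list finds four of the five documents, and the note that calls the principles “the big seven” escapes both searches.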

Document references: The reference problem

A reference problem quickly follows the discovery problem. Just finding the concept is not enough. The documents that reference the concept may reference it in several places and in different ways. A concept may be referenced in the text, and it may be referenced in a link (as a URL embedded in text that may not directly describe the content of the link). The concept may also be referenced in a footnote or endnote, or perhaps in an illustration or table. While highly curated documents may use the concept’s full official name, less formal content is just as likely to reference it in some other way.

The ambiguity of references makes it impossible to discover all the instances through search. The search set will likely prove incomplete because nobody recorded all of the ways the concept was referenced, and therefore, some documents will escape discovery because they reference the concept in a unique way.

Practically, tools such as internal web metrics that count document hits based on keyword searches will discover the majority of the documents that people actually look for and reference; they will not surface documents that exist but are seldom clicked on and perhaps not referenced elsewhere. Again, practically, those documents may not matter, but they will still exist and could be discovered in the future, offering a less-than-accurate view to readers.

A secondary issue comes in references to the references, such as documents that speak to particular principles, like three through five. Four and five in that document, however, no longer apply, making the document mostly irrelevant, even if the discussion about item three proves brilliant. The document would need to be rewritten to reflect the new principles if it were to retain any value.

Further, the secondary issue also includes which principles are assigned to which numbers. Are the new five the first five, or are they a different five? If they are a different five, any document that goes beyond referencing the principles in total as a distinctive object will need to be rewritten. All references become invalid because the object, despite having the same name, now represents a different object.
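The renumbering problem can be made concrete with a minimal sketch. The mapping from old to new principle numbers below is purely hypothetical (the article does not say which principles survived), and the pattern only recognizes digits, so a phrase like “three through five” written in words would slip past it, which is itself an instance of the discovery problem.

```python
import re

# Hypothetical mapping from old principle numbers to new ones;
# None marks a principle that was retired.
OLD_TO_NEW = {1: 1, 2: 2, 3: 3, 4: None, 5: None, 6: 4, 7: 5}

# Matches "principle 3", "principle number 7", and "principles 3 through 5".
REF = re.compile(
    r"principles?\s+(?:number\s+)?(\d)(?:\s+(?:through|to)\s+(\d))?",
    re.IGNORECASE,
)

def numbered_references(text):
    """Return the sorted old principle numbers mentioned in the text."""
    numbers = set()
    for match in REF.finditer(text):
        start = int(match.group(1))
        end = int(match.group(2)) if match.group(2) else start
        numbers.update(range(start, end + 1))
    return sorted(numbers)

def review(text):
    """Classify each numbered reference under the hypothetical mapping."""
    report = {}
    for old in numbered_references(text):
        if old not in OLD_TO_NEW:
            report[old] = "unknown"
            continue
        new = OLD_TO_NEW[old]
        if new is None:
            report[old] = "retired"
        elif new == old:
            report[old] = "unchanged"
        else:
            report[old] = f"renumbered to {new}"
    return report

sample = "This note discusses principles 3 through 5 and revisits principle number 7."
print(review(sample))
```

Even with a complete mapping, the report only says which references need attention; the surrounding prose still has to be rewritten by a person.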

Revisiting Hypertext

Although the problem of managing internal references has been known for decades, no deployed enterprise system adequately addresses it. Most knowledge management and document management systems deal with documents as objects, sometimes as collections of markup structures, but rarely deconstruct them to the most primitive level of sentences, paragraphs, and concepts. Documents built using Hypertext presume well-formed content that leverages object references across documents.

Ted Nelson, the inventor of Hypertext, designed Hypertext systems to implement bi-directional links, not the unidirectional links that we know from the World Wide Web. That means in a well-implemented system, a reference object, like the management principles, would be able to be queried for all the documents in which it is referenced. Hypertext adherents refer to the content most of us work with today as Lump Files, meaning that they are lumps of content together that can’t be parsed by systems to expose their links, and other documents can only link to them in total, not as components.
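A minimal sketch of the bi-directional idea follows. It is not Nelson’s design or any Xanadu implementation; the `LinkIndex` class and the document names are hypothetical. The point is only that when every link is recorded in both directions, a reference object can be asked which documents point to it.

```python
from collections import defaultdict

class LinkIndex:
    """Toy two-way link store: every link is recorded in both directions."""

    def __init__(self):
        self._outbound = defaultdict(set)  # source -> targets it references
        self._inbound = defaultdict(set)   # target -> sources that reference it

    def link(self, source, target):
        """Record that source references target, updating both directions."""
        self._outbound[source].add(target)
        self._inbound[target].add(source)

    def references_from(self, source):
        """What a one-way web link already gives us: where a document points."""
        return sorted(self._outbound.get(source, ()))

    def referenced_by(self, target):
        """What one-way links lack: every document that points at the target."""
        return sorted(self._inbound.get(target, ()))

index = LinkIndex()
index.link("onboarding-guide", "management-principles")
index.link("q3-strategy-memo", "management-principles")
index.link("onboarding-guide", "travel-policy")

print(index.referenced_by("management-principles"))
# ['onboarding-guide', 'q3-strategy-memo']
```

The sketch assumes every reference is created through `link`; a phrase typed into a document body never reaches the index, which is the situation most enterprise content is in.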

The use of Hypertext, in its most complete form, however, never took off. Some initial Hypertext ideas persist, like those on the Project Xanadu site (see the Wikipedia entry here). After a number of moves and acquisitions and continued research, the idea now languishes on edge websites that keep the concept alive without moving it forward.

Transcopyright: Hypertext was devised as a universal publishing system. Because of that, Nelson’s implementations require a form of copyright that recognizes references to even small portions of content: those portions retain their copyright, and the copyright owner may even be paid when they are read. Nelson and his colleagues refer to this as transcopyright. The system is designed to manage attribution and links between references and their original source.

There are systems, such as TheBrain, that offer an object-oriented construct that does include bi-directional links, but they require constructing the document entirely inside the system. TheBrain does accommodate external documents, but placing them inside TheBrain turns the container into metadata. Because of TheBrain’s navigation constructs, storing documents within a topic container will likely make them more discoverable, since that container exists only once within TheBrain. That does not, however, solve the problem of text, spreadsheets, presentations, and other documents with embedded references that cannot be easily teased from their structures.

Some purpose-built systems do manage content at the component level. Proposal management systems, for example, compile a proposal from the latest versions of product, service, schedule, and other descriptions. They suit a company with a standard set of offers that must be configured uniquely for each proposal but that, for the most part, consist of reusable components.

While such a system would make it easy to modify something like the management principles in one place, the change would apply only to documents created in the future; it could not be applied retroactively to existing proposals.

Knowledge and content at this level should be managed much like a bill of materials. A capacitor has a part number. It can be used in a number of assemblies. A manufacturing system can easily be queried to show all of the assemblies that require that part number. The part number, the abstract representation of the part, includes lifecycle management, meaning that if it is replaced by a different component with the same functional profile, a new part number can be referenced, and all bills of material will now point to the new part, not the old part.
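The bill-of-materials analogy can be sketched as a where-used query with part supersession. The `PartsCatalog` class, the part numbers, and the assembly names are all hypothetical; this is not how any particular manufacturing system stores its data, only an illustration of the two behaviors the paragraph describes.

```python
from dataclasses import dataclass, field

@dataclass
class PartsCatalog:
    """Toy bill-of-materials store with where-used and supersession."""
    boms: dict = field(default_factory=dict)           # assembly -> set of part numbers
    superseded_by: dict = field(default_factory=dict)  # old part -> replacement part

    def add_assembly(self, assembly, parts):
        self.boms[assembly] = set(parts)

    def supersede(self, old, new):
        """Record that a new part number replaces an old one."""
        self.superseded_by[old] = new

    def resolve(self, part):
        """Follow the supersession chain to the current part number."""
        seen = set()
        while part in self.superseded_by and part not in seen:
            seen.add(part)  # guards against an accidental cycle
            part = self.superseded_by[part]
        return part

    def where_used(self, part):
        """List assemblies whose bill of materials resolves to this part."""
        current = self.resolve(part)
        return sorted(
            assembly
            for assembly, parts in self.boms.items()
            if any(self.resolve(p) == current for p in parts)
        )

catalog = PartsCatalog()
catalog.add_assembly("power-supply", ["CAP-100", "RES-220"])
catalog.add_assembly("amplifier", ["CAP-100"])
catalog.add_assembly("enclosure", ["SCR-004"])

print(catalog.where_used("CAP-100"))   # assemblies that need the capacitor
catalog.supersede("CAP-100", "CAP-200")
print(catalog.where_used("CAP-200"))   # the same assemblies now resolve to the new part
```

The design choice worth noting is that the bills of materials are never edited: a single supersession record redirects every reference. Prose has no equivalent indirection, because the reference and the surrounding commentary are the same text.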

Unlike content, however, manufacturing systems don’t include commentary or other references that belie precision. Larger engineering systems that capture auxiliary content about parts, their performance in assemblies, quality, and other factors fall into the same traps as content management systems. Further, most of those systems, along with other structured systems like Customer Relationship Management (CRM) systems, associate content back to the original record, be it a part number or a customer number. While the underlying content may be difficult to repurpose or revise if a change is made, at least its context is clear.

How knowledge management fails us

Knowledge management purports to make knowledge more discoverable, but its levels of abstraction often sit too high above the knowledge. In the management principles example, a shift in codified knowledge (which statements constitute the principles) means that most content related to the principles requires rewriting once discovered. And for all the reasons stated above, there is no systematic guarantee that any non-exhaustive examination of available content will retrieve all references to the principles.

Despite decades of work on tags, taxonomies, ontologies, document structures, indexing, pattern recognition, and other techniques and technologies, the complexity of content, together with the simple fact that most content isn’t written to be managed, becomes a considerable burden to those charged with the curation and refitting of ideas deeply embedded in that content. Unfortunately, none of knowledge management’s promises can be fulfilled when working at the detail level of most enterprise content repositories.

This example of content change management also demonstrates the ongoing issues related to content that moves from a managed space into an unmanaged one. In today’s IT environment, it is far easier to share a source document, or at minimum to reference it via a URL (even if some readers of the reference lack the authorization to open it), than to e-mail a copy and exacerbate the proliferation of unmanaged content.

So, what’s the answer?

As much as organizations want to rely on automation to slog through the chaos that is document creation, storage, and management, the best answer, for critical documents, is curation. The problems listed above will persist, but at least the foundational documents, those referenced by the organization’s policies and practices, onboarding material, and marketing content, will be discoverable. That will prove a heavy lift for most, however. Without the discipline and systems associated with programs like spacecraft manufacturing, our more mundane, though still important, documents never arrive in systems that store their components as objects and keep track of where those components get referenced.

If this becomes a serious issue for organizations going forward, they will need to once again invoke Nelson’s vision of Xanadu, a reference to a Coleridge poem that, perhaps fittingly, was never completed.


For more serious insights on knowledge management, click here.


Filed Under: Knowledge Management


Copyright 2009-2026 Serious Insights LLC | Log in
