Duplicates: Files vs Records & Why You Need to Know the Difference

Every database, and ultimately every enterprise content management (ECM) system, has finite storage that the business must manage. Relational databases are filled with countless records and files; unfortunately, many of those are duplicates, which take up much-needed storage space within your ECM environment.

First, a quick rundown of terminology:

File Management: Daily activities involving your business’ physical or digital files (e.g., capture, storage, modification, and sharing). File management focuses on:

  • Organization and faster search of existing documents
  • Reducing lost or misfiled documents
  • Improving processes and efficiencies
  • Reducing space needed to store documents

Records Management: Policies and standards for maintaining diverse types of records, focused on:

  • Creating a file inventory
  • Establishing retention periods (how long to store files)
  • Managing file disposition
  • Developing and implementing records policies and procedures

We all understand intuitively that duplicates are a significant issue in most organizations, but like many aspects of information governance, the problem is not so simple to solve. With files, we must consider the following scenarios.

#1 Indiscriminate Deletion
A policy analyst might work on a position paper in isolation and save that document in their “section” of a shared drive or ECM. The paper is then submitted to a management committee for review or approval, creating two copies of that document: the working copy and the “official” copy. At this point, the working copy can be deleted because the copy submitted to the committee takes precedence, but the working copy may well carry a newer system date. Indiscriminately deleting either version based on date alone introduces risk to the organization.
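As a rough illustration of this scenario, the sketch below groups files by a hash of their content and then decides what to keep without looking at the system date. Everything in it is an assumption made for illustration: the `FileCopy` structure, the `official` flag (standing in for whatever marks a copy as the one submitted to committee), and the function names are hypothetical, not part of any particular ECM product.

```python
import hashlib
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class FileCopy:
    path: str
    content: bytes
    modified: float          # system timestamp, deliberately NOT used to pick a winner
    official: bool = False   # hypothetical flag: "this is the copy of record"


def group_duplicates(copies):
    """Group copies by SHA-256 of their content; return only groups of 2 or more."""
    groups = defaultdict(list)
    for copy in copies:
        groups[hashlib.sha256(copy.content).hexdigest()].append(copy)
    return [group for group in groups.values() if len(group) > 1]


def plan_deletion(group):
    """Return (keep, delete, needs_review) for one group of identical files.

    If exactly one copy is flagged official, keep it and mark the rest as
    deletable. Otherwise delete nothing and send the group to a human,
    rather than guessing from the modification date.
    """
    officials = [copy for copy in group if copy.official]
    if len(officials) == 1:
        keep = officials[0]
        return keep, [copy for copy in group if copy is not keep], False
    return None, [], True
```

In the position-paper example, the working copy could have the newer timestamp and the plan would still keep the committee copy; a group with no flagged copy, or with more than one, is routed to review instead of being deleted.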

#2 Access Control
People often create copies when they want to collaborate or submit information for peer review, but not all collaborators or reviewers work in the same technical environment, whether that is a volume on a shared drive or an ECM system. In this scenario, an author emails a document to a number of peers, and they each save a copy. If we delete all duplicates across all repositories and keep only one, people without access to that specific remaining copy lose their document.
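One way to picture the safeguard is to deduplicate within each repository rather than across all of them, and to check who would be cut off before deleting anything. The sketch below assumes, purely for illustration, that access is granted per repository and that a duplicate group is a list of `(repository, path)` pairs; the function names and data shapes are hypothetical.

```python
def dedupe_within_repositories(copies):
    """Keep the first copy seen in each repository; the rest are removable.

    copies: iterable of (repository, path) pairs with identical content.
    Returns (kept, removable), where kept maps repository -> path.
    """
    kept = {}
    removable = []
    for repository, path in copies:
        if repository in kept:
            removable.append((repository, path))
        else:
            kept[repository] = path
    return kept, removable


def users_losing_access(copies, access, kept_repositories):
    """Return users who could reach a copy before but cannot reach any kept copy.

    access: dict mapping repository -> set of user names (assumed model).
    """
    before = set().union(*(access.get(repo, set()) for repo, _ in copies))
    after = set().union(*(access.get(repo, set()) for repo in kept_repositories))
    return before - after
```

With one copy retained per repository, `users_losing_access` returns an empty set under this access model; collapsing everything to a single copy in one repository would list every user who only had access elsewhere.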

#3 Migration
This scenario is the mirror image of the access control scenario. In some cases, everyone in an organization has access to content in a legacy system, and files are migrated into a new environment. Management may want to take this opportunity to apply access controls by segregating content into different volumes and designating access to each one. Again, indiscriminate file deletion may cut off people who need a file from the only remaining copy in the new environment.

These same issues exist in records management, just on a larger scale. Imagine the deletion of an entire customer record with hundreds of associated files, or your team's inability to access and collaborate on records across the enterprise. The problems associated with file management are magnified at that scale, which introduces greater risk to your organization.