
When naming files for general office use, is there still a need to use an underscore, hyphen, or CamelCase? I understand these conventions originally arose because different operating systems handled file names differently. But in today’s world, where almost all files are stored and retrieved on a Windows NTFS system and/or in a document management system like OpenText or SharePoint, is there any reason not to just use a space between the words? I do understand that if you upload to a web site you will get %20 for each space, but I am talking about the average office document stored on the office network drive or in a document management system.
——————————
Bud Porter-Roth
Principal Consultant
Porter-Roth Associates
——————————
Hi Bud;
No, there is no longer a need to use underscores, dashes, etc., especially when storing in an ECM environment such as OpenText or Hyland. On the other hand, storing mission-critical data on network drives is a REALLY BAD idea, as you have none of the protections that you get with a trustworthy storage environment. Consider what happened recently to Baltimore with the ransomware attack, and there have been many others. What we have found is that in organizations that had a trustworthy storage environment, the trustworthy storage sub-systems were able to protect the content even though the network drives got encrypted. The important point is to protect your mission-critical data: don’t rely only on IT security methods or other protocols related to the OS rather than the ECM environment, as it is the ECM technologies that store and manage the content throughout its life-cycle.
The original purpose of adding dashes, underscores, etc. was that during the late ’90s the versions of Windows available couldn’t handle filenames that didn’t have a contiguous prefix and required all filenames to be ‘7 bit’, along with other limitations. This was also the time when storage was exceptionally expensive and organizations were only starting to consider using ECM for storing content other than engineering files. As a result, many people used underscores, dashes, etc. so that the file could be stored in Windows. With ECM technologies there is no value in this, as the filename itself is no longer relevant. Instead, it usually becomes an additional descriptor loaded into ECM (if desired) along with the file, and index values enable users to locate previously stored information without relying on users’ file naming conventions. This approach also supports ingesting the content into a trustworthy storage sub-system (TSS), where the information is protected from malware and other malicious attacks. This is accomplished through the implementation of a trustworthy environment. We (the ECM and Digital Transformation Industry Standards program) have finished preparing several industry standards that will be of value to you and your clients, such as ISO 18829 (ECM Assessments), ISO 15801 (Trustworthy Content Management) and ISO 22957 (ECM Design and Implementation). I highly recommend you get these and review them. The program is also in the process of preparing several ISO (international) best practices related to planning and implementing Digital Transformation and Trustworthy ECM Environments.
For more information, please contact either the ECM Standards program director, Betsy Fanning, at betsy.fanning @ 3dpdfconsortium.org or me (I am the chair of a few of the committees) at blatt @ eid-inc.com. I hope this helps.
——————————
Robert Blatt, MIT, LIT, CHPA-III
Principal Consultant, Electronic Image Designers (EID).
AIIM Fellow #175
Chair, Trustworthy Storage
Chair, Trustworthy Document Management & Assessment
Chair, ECM Implementation Guidelines
ISO Convenor (18829, 18759, 22957, 18759)
US Delegate to ISO TC/171
TC/171 Liaison Officer to TC46 SC11
TC/171 Liaison Officer to TC/272
——————————
Robert, good info, as usual, and thanks………I will check out the standards mentioned. Bud
——————————
Bud Porter-Roth
Principal Consultant
Porter-Roth Associates
——————————
In my experience, extended characters are usually an issue in long-term conversion strategies. I had an EMC conversion go from 30k to 300k because of the conversion of extended characters when moving from one system to the next. I think a good naming convention and training create a much smoother long-term solution. IMO.
Carl, thanks. Migration of files, especially ones with non-standard names, is problematic. Bud
——————————
Bud Porter-Roth
Principal Consultant
Porter-Roth Associates
——————————
In fact it’s the opposite – CamelCase, hyphens and underscores can be outright harmful, as they may not be discovered by search (where no “clean” version of the name exists in the metadata otherwise).
* Underscore _ is treated the same as a space, except that it is also searchable itself. So if the filename is “Master_Agreement.doc”, it can be found by master, agreement or even pin-pointed by searching for master_agreement (no quoting needed).
* Hyphen is also treated as a space but it’s also a special character meaning “not”, so searching for it specifically needs quotes. Thus “Master-Agreement.doc” can be found by master or agreement as above, but searching for master-agreement (without quotes) will not find it, as it will specifically leave out anything with the word agreement!
Thus, spaces are preferable, underscores are a good alternative, and hyphens should be avoided (at least in SharePoint). The same applies to metadata!
——————————
Pauli Visuri
Consulting Director
Sharepoint City
——————————
Pauli, thanks for the mini-tutorial. I don’t think most people would even think about how these three naming conventions affect search. Bud
——————————
Bud Porter-Roth
Principal Consultant
Porter-Roth Associates
——————————
Good point, but I would say this speaks more to the software than to the naming convention. Tools like Hyland and OpenText do not have these deficiencies.
So I suppose the important thing is to understand how your users work, store, and search. Choose the right tools, and know how a naming convention may limit them.
Thanks,
-Rick
——————————
Richard Molique
ECM Consultant
IQ Business Group, Inc.
804-614-6445
rmolique@…
——————————
Richard –
I think that every tool/software has deficiencies. However, the fact that search (by default) matches only at the start of a word is a side effect of the way that search works, in whatever tool. When we need to solve for searching in the middle of strings we have to play some games. I detail these in a post I did about wildcarding at https://www.thorprojects.com/blog/archive/2016/10/05/search-wildcarding-front-to-back-and-back-to-front/ where I speak in terms of both full-text and SQL searching to make it more understandable.
I’m not 100% certain about Pauli’s note about a hyphen. I know that it used to be that hyphens were treated as NOT only when they were preceded by white space but I suppose that it’s possible that the behavior changed. It would be odd, however, because there’s a relative standard about the behavior of punctuation like hyphen and quotes.
With that out of the way, I don’t believe file naming conventions are necessary any longer — or effective. Most search engines optimize for titles and support full-text. With that plus metadata I rarely see people searching by file name.
Rob
——————————
Robert Bogue
President
Thor Projects LLC
——————————
Example: 20190625 AIIM Presentation by Gray. The next time there is a presentation, it will fall under this one.
I hope this helps a bit.
Regards,
——————————
Rhonda Hazlett
Corporate Document Administrator
Olin Corporation
——————————
Rhonda, thanks, and an important point if you want files to sort in chronological order. And if you do manual versions, as in V01, V02, etc., you should also use the leading “0” to allow for correct sorting. Bud
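To illustrate the sorting point, here is a minimal Python sketch. It assumes a tool that orders names by plain character-by-character comparison (not every file browser does; some sort embedded numbers naturally). The file names are made up for the example.

```python
# Hypothetical file names, used only to illustrate plain string sorting.
unpadded = ["Policy V10.docx", "Policy V2.docx", "Policy V1.docx"]
padded = ["Policy V10.docx", "Policy V02.docx", "Policy V01.docx"]
dated = ["20190625 AIIM Presentation.pptx", "20181103 AIIM Presentation.pptx"]

# sorted() compares strings character by character, so "V10" lands before "V2".
sorted_unpadded = sorted(unpadded)
# With a leading zero, character order and numeric order agree.
sorted_padded = sorted(padded)
# YYYYMMDD prefixes sort chronologically for the same reason.
sorted_dated = sorted(dated)

print(sorted_unpadded)
print(sorted_padded)
print(sorted_dated)
```

With zero-filled versions and YYYYMMDD dates, the plain sort and the intended order coincide.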
——————————
Bud Porter-Roth
Principal Consultant
Porter-Roth Associates
——————————
Excellent thought. Thank you
Rhonda Hazlett
Olin Corporation Document Mgmnt Administrator
3855 North Ocoee Street Cleveland, TN 37312 O +1-423-336-4053
Rhonda, Bud, it is not good practice to add dates or version numbers to filenames, except in very specific situations.
Consider the following:
* Files are frequently updated after they have been created. Thus the date will be wrong, unless the user goes to the trouble of changing it, after saving.
* The same applies to version numbers.
I have had to review tens of thousands of files as part of migrations, and in more than a third of all cases any dates or version numbers in the filename have been incorrect.
What’s more, in Content Management systems:
* Renaming is often even harder than on plain disk storage
* The system handles the versioning, which will be totally messed up if the filename changes in CMSs such as SharePoint, where the name is used as the identifier for the item.
These are my rules of thumb which I teach to users switching to Content Management:
– Filenames are forever. They must not have anything that would need to be changed if the file changes.
– Any variable information such as dates, version or status should be put in the file’s metadata
There are a few exceptions to the above, e.g.
– where a date is an intrinsic part of the file’s identifier – e.g. in a news article about something that happened on a specific date (this was actually the case in Rhonda’s example).
– where an edited file becomes a new and independent “Release” or “Issue”, instead of just an updated version, and the previous releases will have equal status to the new one. This is the case with policies or regulations, for example, where past ones will still continue to apply.
Finally – where there IS a date in the file name, consider using the international format, “2019-06-25” instead of “20190625” or any localised format. This will be understood by search engines, and far easier to handle in any migration situation.
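For files that already carry a compact date, converting to the international format is mechanical. Below is a minimal Python sketch, assuming the date is a valid eight-digit YYYYMMDD string. The function name `to_iso_date` is made up for illustration.

```python
from datetime import datetime


def to_iso_date(compact: str) -> str:
    """Convert a YYYYMMDD string to YYYY-MM-DD.

    Parsing through datetime also validates the date, so an impossible
    value such as month 13 raises ValueError instead of passing through.
    """
    return datetime.strptime(compact, "%Y%m%d").date().isoformat()


converted = to_iso_date("20190625")
print(converted)
```

Parsing and re-emitting the date, rather than inserting dashes by position, means a mistyped date is caught at rename time instead of surfacing later in a migration.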
——————————
Pauli Visuri
Consulting Director
Sharepoint City
——————————
Pauli,
Agree with everything you said (heartily!), except the very last point about the dates. I would strongly recommend underscores versus the dashes in the date as the dashes could be interpreted by some storage systems, and more likely by search providers, as a special character, thus being potentially problematic.
Lorne – yes I know, and in fact that is my general advice to users as well (avoid dashes, use spaces or underscores instead).
However, the ISO 8601 standard date format specifies dashes.
At least SharePoint search is able to tell this apart from a special character in searching, it recognises the format and understands it’s a date.
——————————
Pauli Visuri
Consulting Director
Sharepoint City
——————————
I personally see a restriction on using spaces in a file name as an artificial one that doesn’t recognize the way that collaborative users create documents. They typically want a meaningful, friendly file name, and unless we are somehow making the ‘mistake’ of trying to embed metadata in the file name, it should be irrelevant to the storage platform managing the document whether the name contains spaces in particular. I have seen restrictions on other characters that can impact delivery, such as certain special characters used in HTML.
——————————
Peter Rahalski, CIP
Information Solutions Architect
EXCELLUS HEALTH PLANS
——————————
Good Morning All,
With the introduction of lengthy file names and ‘anything goes’ for a file name, we are finding that many files fail the migration/conversion process due to special characters in the file name. There is something to be said for the old days of ‘8.3’ file names, where no spaces or special characters were allowed and the file extension always followed the period. We provided training for many years on naming conventions, encouraging no spaces and no special characters other than underscore and hyphen. We promote and encourage the use of the YYYYMMDD date format and, where versioning or other sequential numbering is relevant, always zero-filling to aid uniformity and sort order.
Due to special characters being problematic in migration, conversion and preservation, when we receive a batch of files, one of our steps in readying the files for preservation is to run a file renaming utility, such as File Renamer or Bulk Utility, to discover and remove special characters. We search for ALL special characters, including those we accept (underscore and hyphen), using this expression: [][!"#$%&'()*+,./:;<=>?@\^_`{|}~-]
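The same discovery step can be sketched in Python. This is not the renaming utilities mentioned above. It assumes bare file names (no folder path) and uses the 32 ASCII punctuation characters, which is the set the expression lists. The names `find_special` and `strip_special` are made up for illustration.

```python
import re
import string
from pathlib import PurePath

# Character class built from the 32 ASCII punctuation characters.
# re.escape keeps characters such as ], \ and - literal inside the class.
SPECIAL = re.compile("[" + re.escape(string.punctuation) + "]")


def find_special(filename):
    """List the special characters in the name, ignoring the extension."""
    stem = PurePath(filename).stem
    return SPECIAL.findall(stem)


def strip_special(filename, keep="_-"):
    """Remove special characters from the stem, except those in `keep`."""
    path = PurePath(filename)
    cleaned = "".join(
        ch for ch in path.stem
        if ch in keep or ch not in string.punctuation
    )
    return cleaned + path.suffix


found = find_special("Budget (final) v1.2.xlsx")
cleaned = strip_special("Budget (final) v1.2.xlsx")
print(found)
print(cleaned)
```

The extension is split off first because the expression includes the period, which would otherwise flag every file that has an extension.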
After learning of the potential issues with the use of hyphen in SharePoint, we may need to reconsider.
I would caution anyone against building file naming rules or guidelines around the ‘software’ or ‘system’ that is today’s ‘flavor of the day’. Instead, realize that with rapid technology changes, what is in use today was not in use 10 years ago and likely will not be in use 10 years from now. In order to bridge all systems, and to plan for migration and preservation, stick to the basics of a meaningful file naming convention, similar to the one mentioned earlier in this string (I believe it was something like YYYYMMDD_MeaningfulFileName_Author_Version), or another similarly simple but meaningful name in your area of business.
——————————
Linda Avetta
Digital Archives and Records Division Chief
PA State Archives
Pennsylvania Historical & Museum Commission
——————————

Again………thanks to everyone who contributed.
Bud
——————————
Bud Porter-Roth
Principal Consultant
Porter-Roth Associates
——————————
Hi Bud,
I put one I have up on my site if you want to take a look:
https://ariaconsulting.net/resources-blog
Thanks Bud – that would be great.
Supplying two attachments –
1) NamingConventions. This is used for specific naming conventions for assets in the Archives. So, although they may have valuable information for us, they may be over the top for you all … it’s still food for thought. The underscores provide the various field breaks. Parsing the data with this separator is very easy to do to create indexes, lists, etc. It works very well for our institutional holdings. But may not work for your environments.
2) NamingConventions_PresentationExcerpt. This was used for initial training for staff. We sometimes use it as a refresher for new hires who will be working on scanning projects. We should probably train ALL new hires, because as someone else noted, as soon as folks start doing things their own way, then no one can decipher what’s what.
As you can see, both are very dated, but the general concepts remain and are followed today. On the presentation excerpt, you will note plenty of screen shots that are dated (including the format types, which should be updated, such as .docx, .xlsx, PDF/A, etc.). Some of the slides may not be comprehensible without the trainer; a few are exercises for the participants (to identify what is correct, and to rename properly those that are incorrect). But either way you’ll still get the gist.
Use whatever you feel relevant. If you have any questions, please ping me.
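The point in item 1 about underscores as field breaks can be sketched in Python. The four-field layout below (date, title, author, version) is an assumption borrowed from the YYYYMMDD_MeaningfulFileName_Author_Version pattern recalled earlier in the thread, not the contents of the attached conventions. The names `NameFields` and `parse_name` are hypothetical.

```python
from pathlib import PurePath
from typing import NamedTuple


class NameFields(NamedTuple):
    """Assumed field layout: date, title, author, version."""
    date: str
    title: str
    author: str
    version: str


def parse_name(filename: str) -> NameFields:
    """Split an underscore-delimited file name into its fields."""
    stem = PurePath(filename).stem  # drop the extension before splitting
    parts = stem.split("_")
    if len(parts) != 4:
        raise ValueError(
            f"expected 4 underscore-separated fields, got {len(parts)}: {filename!r}"
        )
    return NameFields(*parts)


# Hypothetical file name following the assumed layout.
fields = parse_name("20190625_AIIMPresentation_Gray_V01.pptx")
print(fields.date, fields.title, fields.author, fields.version)
```

This only works if the separator never appears inside a field, which is one reason a convention like this tends to pair underscores with CamelCase inside the descriptive part.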
Linda
Linda Avetta | Division Chief | CGCIO
Digital Archives & Records Division
Pennsylvania State Archives
PA Historical & Museum Commission
The Commonwealth of Pennsylvania
1825 Stanley Drive | Harrisburg, PA 17103
Phone: 717.705.6923
Visit us on the web at PHMC.state.pa.us and DigitalArchives
Linda, thanks for sending me the 2 files, I appreciate it very much. I did a quick look and they look good. Bud
——————————
Bud Porter-Roth
Principal Consultant
Porter-Roth Associates
——————————