Problem Description
A utility program that sorts the HTML code of a webpage into
increasing date sequence, for pages whose logical sections are separated
by date fields.
Background & Techniques
The What's New items on the DFF home page and in the "WhatsNew" folder
raise a question that most authors of incrementally updated web pages
(or other files) must answer: "Should I add new information at the top, the
way that frequent readers would expect it, or at the bottom, the way that
makes more sense when reading the file as an archive? Or some combination?"
Bloggers and forum administrators face the same question. I haven't
researched how others answer it, but my answer has been to
maintain the What's New section of the home page with new entries at the
top. After a couple of months, the oldest entries are moved to a month-dated
file in the "What's New" folder, indexed by home page entries. Manually
sorting items within these files from descending to ascending date sequence
is a time-consuming and error-prone process, usually undertaken only for the
quarterly DFF newsletters.
This program attempts to automate the process. Although only a few archived
WhatsNew files have been sorted as test cases, my intention is to sort them
all.
I have had to make one small change in how the items are formatted.
Those with associated images must have the image inserted after the date
header. Currently many images have been inserted before the date header with
an "align=left" or "align=right" parameter, which causes the following text
to display beside the image. It's an effect I like, but I haven't quite
figured out how to detect this case and keep the image with its associated
text while sorting occurs.
Non-programmers are welcome to read on, but may
want to skip to the bottom of this page to download
an executable version of the program.
Programmer's Notes
To implement the sorting, I copy the HTML file to a TStringList,
InList. For each line that contains a date field, I create a TEntryObj
object to hold the date, the starting line number, and the number of lines
in that section, and attach it as an Object to that InList entry.
Two special TEntryObj objects with date years of 1950 and
2050 mark the lines preceding the first date section and
the lines following the last date section, so that they sort first and last.
The "header" object is attached to the first InList entry, and the "trailer"
object is attached to the line containing a user-defined string, which so
far has been the same in every test case.
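The first pass described above can be sketched roughly as follows. This is only an illustration, not the program's actual source: the field names of TEntryObj, the helper functions NewEntry and TryGetDateKey, the "<h3>yyyy-mm-dd" date header format, the trailer string, and the January 1 month/day used for the 1950 and 2050 entries are all assumptions made for the example.

```pascal
program AttachEntries;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

uses
  SysUtils, Classes;

const
  TrailerMark = '<!-- end of entries -->';  // hypothetical user-defined string

type
  // Assumed layout of the entry object; the real field names may differ.
  TEntryObj = class
  public
    DateKey: string;     // date as 'yyyymmdd'
    StartLine: Integer;  // index in InList of the section's first line
    LineCount: Integer;  // left at 0 here; filled in by a later pass
  end;

function NewEntry(const Key: string; Start: Integer): TEntryObj;
begin
  Result := TEntryObj.Create;
  Result.DateKey := Key;
  Result.StartLine := Start;
  Result.LineCount := 0;
end;

// Assumed date header format: a line starting with '<h3>yyyy-mm-dd'.
function TryGetDateKey(const Line: string; out Key: string): Boolean;
var
  S: string;
  Y, M, D: Integer;
begin
  Result := False;
  Key := '';
  S := Trim(Line);
  if Copy(S, 1, 4) <> '<h3>' then
    Exit;
  S := Copy(S, 5, 10);
  if (Length(S) = 10) and (S[5] = '-') and (S[8] = '-') and
     TryStrToInt(Copy(S, 1, 4), Y) and TryStrToInt(Copy(S, 6, 2), M) and
     TryStrToInt(Copy(S, 9, 2), D) then
  begin
    Key := Format('%.4d%.2d%.2d', [Y, M, D]);
    Result := True;
  end;
end;

var
  InList: TStringList;
  Entry: TEntryObj;
  Key: string;
  I: Integer;
begin
  InList := TStringList.Create;
  try
    // Sample page; a real run would call InList.LoadFromFile instead.
    InList.Add('<html><body>');
    InList.Add('<h3>2010-03-20</h3>');
    InList.Add('<p>Newest item</p>');
    InList.Add('<h3>2010-03-05</h3>');
    InList.Add('<p>Older item</p>');
    InList.Add(TrailerMark);
    InList.Add('</body></html>');

    // Attach an entry object to the first line, each date line, and the trailer.
    for I := 0 to InList.Count - 1 do
    begin
      if I = 0 then
        InList.Objects[I] := NewEntry('19500101', I)   // "header" entry
      else if Pos(TrailerMark, InList[I]) > 0 then
        InList.Objects[I] := NewEntry('20500101', I)   // "trailer" entry
      else if TryGetDateKey(InList[I], Key) then
        InList.Objects[I] := NewEntry(Key, I);
    end;

    // Show which lines received an entry object.
    for I := 0 to InList.Count - 1 do
      if InList.Objects[I] <> nil then
      begin
        Entry := TEntryObj(InList.Objects[I]);
        WriteLn(Entry.DateKey, ' starts at line ', Entry.StartLine);
      end;
  finally
    // The list does not own its objects here, so free them by hand.
    for I := 0 to InList.Count - 1 do
      InList.Objects[I].Free;
    InList.Free;
  end;
end.
```

Lines without a date keep a nil Object, which is how the later pass can tell section starts from ordinary lines.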
Rather than fill in line count information as the lines are added,
a second pass is made through InList. In this pass each entry object
is extracted and added to another TStringList, EntryList, which contains
one entry for each InList line that has an object attached. It
is an easy matter to update the line count information in the objects as
they are added to EntryList. The current record number minus
the starting record number from the previous EntryList entry is the
line count for the previous object. The key string for EntryList
items is a "yyyymmdd" formatted date string which can be sorted in
ascending date sequence using the standard TStringList Sort method.
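A minimal sketch of this second pass is shown below, under the same assumptions as before: the TEntryObj field names and the NewEntry helper are invented for the example, and the sample InList is built by hand as a stand-in for the first pass. The last entry's count is taken here as the number of lines remaining to the end of the file, which the text above does not spell out.

```pascal
program BuildEntryList;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

uses
  SysUtils, Classes;

type
  // Assumed layout of the entry object; the real field names may differ.
  TEntryObj = class
  public
    DateKey: string;     // 'yyyymmdd' sort key
    StartLine: Integer;  // index in InList of the section's first line
    LineCount: Integer;  // computed in the loop below
  end;

function NewEntry(const Key: string; Start: Integer): TEntryObj;
begin
  Result := TEntryObj.Create;
  Result.DateKey := Key;
  Result.StartLine := Start;
  Result.LineCount := 0;
end;

var
  InList, EntryList: TStringList;
  Prev, Cur: TEntryObj;
  I: Integer;
begin
  InList := TStringList.Create;
  EntryList := TStringList.Create;
  try
    // Stand-in for the first pass: lines with objects already attached.
    InList.AddObject('<html><body>', NewEntry('19500101', 0));
    InList.AddObject('<h3>2010-03-20</h3>', NewEntry('20100320', 1));
    InList.Add('<p>Newest item</p>');
    InList.Add('<p>More text</p>');
    InList.AddObject('<h3>2010-03-05</h3>', NewEntry('20100305', 4));
    InList.Add('<p>Older item</p>');
    InList.AddObject('<!-- end of entries -->', NewEntry('20500101', 6));
    InList.Add('</body></html>');

    // Collect the entry objects and fill in each previous entry's line count.
    Prev := nil;
    for I := 0 to InList.Count - 1 do
      if InList.Objects[I] <> nil then
      begin
        Cur := TEntryObj(InList.Objects[I]);
        if Prev <> nil then
          Prev.LineCount := I - Prev.StartLine;
        EntryList.AddObject(Cur.DateKey, Cur);
        Prev := Cur;
      end;
    // Assumption: the last (trailer) entry runs to the end of the file.
    if Prev <> nil then
      Prev.LineCount := InList.Count - Prev.StartLine;

    // 'yyyymmdd' keys sort correctly as plain strings.
    EntryList.Sort;

    for I := 0 to EntryList.Count - 1 do
    begin
      Cur := TEntryObj(EntryList.Objects[I]);
      WriteLn(EntryList[I], ': start ', Cur.StartLine, ', lines ', Cur.LineCount);
    end;
  finally
    // Both lists share the same objects; free them once, via InList.
    for I := 0 to InList.Count - 1 do
      InList.Objects[I].Free;
    EntryList.Free;
    InList.Free;
  end;
end.
```

Because year, month, and day are zero-padded and ordered from most to least significant, string order and date order agree, so no custom compare function is needed.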
A final pass is made through EntryList, copying lines as
specified by each record's TEntryObj, now sorted in ascending date
sequence, from InList to a third TStringList, OutList.
OutList is then saved to the specified sorted file name on disk.
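The copy step might look something like the sketch below. As before, the TEntryObj fields and the NewEntry helper are assumptions, the sample EntryList is filled in by hand with start lines and counts that a previous pass would have computed, and the output file name "sorted_example.htm" is purely hypothetical.

```pascal
program CopySortedSections;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

uses
  SysUtils, Classes;

type
  // Assumed layout of the entry object; the real field names may differ.
  TEntryObj = class
  public
    StartLine: Integer;  // index in InList of the section's first line
    LineCount: Integer;  // number of lines in the section
  end;

function NewEntry(Start, Count: Integer): TEntryObj;
begin
  Result := TEntryObj.Create;
  Result.StartLine := Start;
  Result.LineCount := Count;
end;

var
  InList, EntryList, OutList: TStringList;
  Entry: TEntryObj;
  I, J: Integer;
begin
  InList := TStringList.Create;
  EntryList := TStringList.Create;
  OutList := TStringList.Create;
  try
    // Sample page with the newest item first.
    InList.Add('<html><body>');
    InList.Add('<h3>2010-03-20</h3>');
    InList.Add('<p>Newest item</p>');
    InList.Add('<h3>2010-03-05</h3>');
    InList.Add('<p>Older item</p>');
    InList.Add('<!-- end of entries -->');
    InList.Add('</body></html>');

    // Stand-in for the earlier passes: keys with start lines and counts.
    EntryList.AddObject('19500101', NewEntry(0, 1));  // header lines
    EntryList.AddObject('20100320', NewEntry(1, 2));
    EntryList.AddObject('20100305', NewEntry(3, 2));
    EntryList.AddObject('20500101', NewEntry(5, 2));  // trailer lines
    EntryList.Sort;

    // Copy each section's block of lines in sorted key order.
    for I := 0 to EntryList.Count - 1 do
    begin
      Entry := TEntryObj(EntryList.Objects[I]);
      for J := Entry.StartLine to Entry.StartLine + Entry.LineCount - 1 do
        OutList.Add(InList[J]);
    end;

    Write(OutList.Text);
    OutList.SaveToFile('sorted_example.htm');  // hypothetical file name
  finally
    for I := 0 to EntryList.Count - 1 do
      EntryList.Objects[I].Free;
    OutList.Free;
    EntryList.Free;
    InList.Free;
  end;
end.
```

Since the header and trailer entries carry the earliest and latest keys, the page's opening and closing lines stay in place while the dated sections between them are reordered.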
One other new area was the handling of directories in this program.
I initialize a directory named "Sorted Files" in the same directory as the
program, using Delphi's DirectoryExists function to see if the
folder exists and procedure MkDir to create it if it does not.
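A small sketch of that directory check follows. It assumes a Delphi version (or Free Pascal) in which DirectoryExists is available from SysUtils, and it locates the program's own folder with ExtractFilePath(ParamStr(0)), which is one common way to do it and not necessarily what the program itself uses.

```pascal
program EnsureSortedDir;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

uses
  SysUtils;

var
  SortedDir: string;
begin
  // ExtractFilePath keeps the trailing path delimiter, so simple
  // concatenation gives the full folder name.
  SortedDir := ExtractFilePath(ParamStr(0)) + 'Sorted Files';

  // Create the folder only when it is missing; MkDir raises an
  // exception if the folder cannot be created.
  if not DirectoryExists(SortedDir) then
    MkDir(SortedDir);

  if DirectoryExists(SortedDir) then
    WriteLn('Output folder ready: ', SortedDir);
end.
```

MkDir creates a single directory level only, which is sufficient here because the parent folder (the one holding the program) already exists.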
Also, a new UBrowseFolder unit contains a BrowseForFolder
function lifted from Brian Cryer's webpage at
http://www.cryer.co.uk/brian/delphi/howto_browseforfolder.htm. It
uses the SHGetPathFromIDList function defined as part of the Windows
API Shell namespace.
Running/Exploring the Program