Geog 388
DLG to .shp Write
Due in about a week


The objective is to become familiar with the content of both USGS DLG files and shapefiles, and to consider the strengths of the former and limits of the latter.


The assignment is to transcribe data (just three line records) from a DLG file to .shp file format, manually, and then answer a few questions about what could be in the .shp file.

The USGS Digital Line Graph (DLG) format is documented online (see USGS 1984 in the references below), but for this exercise consider just these three excerpted "line" records:

L     2     5     2     2     6                 2    0    0
    532812.91  4233413.06   532773.94  4247282.79   
L     3     2     6     1     2                 2    0    0
    532757.10  4247282.79   539496.77  4247314.04
L     4     6     7     1     5                 2    0    0
    539496.77  4247314.04   542795.89  4247330.85
Each line record starts with an "L" which is followed by eight numbers:
  1. the line's ID number,
  2. from node,
  3. to node,
  4. area left,
  5. area right,
  6. the number of points in the line (2 in each of these),
  7. the number of major and minor attribute code pairs (0 in these), and
  8. the number of characters of text data that follows (0 in these).
These eight numbers are followed by as many coordinate pairs (here, UTM Eastings and Northings, i.e. "x" and "y") as item 6 indicates. Notice that the first five numbers on the "L" record embody the full topological structure of the dataset, and that all you really need for the shapefile are the coordinates.
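To make the record layout concrete, here is a minimal Python sketch that reads one "L" record and its coordinate line. The names DlgLine and parse_dlg_line are made up for this illustration, and splitting on whitespace is an assumption that happens to work for the three excerpts above; the real DLG format is column-oriented, so a production reader would not rely on it.

```python
from dataclasses import dataclass


@dataclass
class DlgLine:
    """One DLG line record (hypothetical container for this exercise)."""
    line_id: int
    from_node: int
    to_node: int
    area_left: int
    area_right: int
    points: list  # list of (easting, northing) tuples


def parse_dlg_line(header: str, coord_text: str) -> DlgLine:
    """Parse an 'L' header line plus the coordinate text that follows it."""
    fields = header.split()
    if not fields or fields[0] != "L":
        raise ValueError("not a line record")
    # The eight numbers after the "L", in the order listed above.
    (line_id, from_node, to_node, area_left, area_right,
     n_points, n_attr_pairs, n_text_chars) = map(int, fields[1:9])
    values = [float(v) for v in coord_text.split()]
    if len(values) != 2 * n_points:
        raise ValueError("coordinate count does not match item 6")
    # Pair up alternating x (easting) and y (northing) values.
    points = list(zip(values[0::2], values[1::2]))
    return DlgLine(line_id, from_node, to_node, area_left, area_right, points)


line = parse_dlg_line(
    "L     2     5     2     2     6                 2    0    0",
    "    532812.91  4233413.06   532773.94  4247282.79   ",
)
print(line.line_id, line.points)
```

Only line.points is needed for the shapefile; the node and area fields are the topology that the .shp format has no place for.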

ESRI's shapefiles are documented in an ESRI white paper, ESRI Shapefile Technical Description, which you should examine. Perhaps even study.

Once you have your head wrapped around what is in a shapefile's .shp file, fill in a table like the example below with the contents of the .shp file that would represent the three lines above, indicating what content should be in each byte position in the file. That is, lay out the content for the file header and for the three record header / record content pairs, like this (but with the right numbers):
Position (bytes)    Value                    Data type
0 - 3               9994                     integer
4 - 23              0                        integer
24 - 27             length                   integer
28 - 31             1000                     integer
etc.                See Table 1, p. 4 of the white paper
96 - 99             0                        integer
100 - 103           1 (Why? See Table 2.)    integer
104 - 107           content length           integer
...                 See Table 6, p. 8 of the white paper
(n-16) - (n-8)      x-coord                  double
(n-8) - (n)         y-coord                  double

(Here's an aside that most of you should ignore, but if all of this does not have you thoroughly confused already and you've noted that we are ignoring the switching between little endian and big endian order that ESRI sprinkled in this format, you might want to ponder why anyone would do that.)
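For anyone who does want to ponder the aside, here is a small Python sketch of what the byte-order switch looks like. It assumes the standard struct module and relies on the white paper's header table, which marks the file code and file length as big-endian and the version and shape type as little-endian. The variable names are only for illustration.

```python
import struct

# File code 9994 is stored big-endian (most significant byte first).
file_code_be = struct.pack(">i", 9994)

# Version 1000 is stored little-endian (least significant byte first).
version_le = struct.pack("<i", 1000)

print(file_code_be.hex())  # 0000270a
print(version_le.hex())    # e8030000

# Reading the file code with the wrong byte order gives a very different number.
misread = struct.unpack("<i", file_code_be)[0]
print(misread)

# Coordinates are 8-byte doubles; in record contents they are little-endian.
x_bytes = struct.pack("<d", 532812.91)
print(len(x_bytes))
```

The same four bytes mean different integers depending on the order in which they are read, which is why a reader or writer of .shp files has to track which fields use which order.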

When you've got the content down...

Answer these questions...

  1. What is the maximum allowable file length for a valid .shp file?
  2. What is the maximum number of parts allowed in a polygon in a shapefile?
  3. Can/does a shapefile encode topology?
  4. Where does a shapefile encode minimum bounding rectangles (MBRs)?

Some Notes on numbers in Shapefiles

Basic structure of a .shp file:
File Header
Record Header
Record Contents
Record Header
Record Contents
... continuing...
Record Header
Record Contents

Files are sequences of bytes taken and interpreted in groups.

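Shapefiles (.shp) use two digital representations of numbers: signed 32-bit integers (4 bytes each) and signed 64-bit IEEE double-precision floating-point numbers (8 bytes each).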

The value for the file length in the file header is the total length of the file, including the header, counted in 16-bit words.

The content length in each record header excludes the eight bytes of the record header, so it is just the length of the content part of the record, specified as the count of 16-bit words.
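The arithmetic behind these two length fields can be sketched in Python as follows. The function names are hypothetical, and the byte counts assume the PolyLine record layout given in the white paper (shape type, bounding box, NumParts, NumPoints, a parts array, and a points array). The example deliberately uses a single five-point line rather than the three lines of the assignment.

```python
HEADER_BYTES = 100        # fixed size of the .shp file header
RECORD_HEADER_BYTES = 8   # record number + content length, 4 bytes each


def polyline_content_bytes(num_parts: int, num_points: int) -> int:
    """Bytes in the content of one PolyLine record (record header excluded)."""
    shape_type = 4                 # one integer
    box = 4 * 8                    # xmin, ymin, xmax, ymax as doubles
    counts = 4 + 4                 # NumParts and NumPoints integers
    parts = 4 * num_parts          # one integer index per part
    points = 16 * num_points       # one x/y pair of doubles per point
    return shape_type + box + counts + parts + points


def to_words(n_bytes: int) -> int:
    """Convert a byte count to a count of 16-bit words."""
    if n_bytes % 2:
        raise ValueError("byte count must be even")
    return n_bytes // 2


def file_length_words(content_byte_counts) -> int:
    """Total file length in 16-bit words, header included."""
    total = HEADER_BYTES + sum(
        RECORD_HEADER_BYTES + c for c in content_byte_counts
    )
    return to_words(total)


# A hypothetical file holding one polyline with 1 part and 5 points.
content = polyline_content_bytes(num_parts=1, num_points=5)
print(content, to_words(content), file_length_words([content]))
```

Note that the content length stored in the record header is to_words(content), not the byte count, and that it leaves out the eight bytes of the record header itself.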

Polygons close with the first point explicitly repeated as the last. That is, a triangle shape will have four points in its array.
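As a small illustration of that closing rule, here is a Python sketch that repeats the first point at the end of a ring when it is not already there. The function name close_ring and the triangle coordinates are made up for the example.

```python
def close_ring(points):
    """Return a copy of the ring with the first point repeated as the last."""
    ring = list(points)
    if ring and ring[0] != ring[-1]:
        ring.append(ring[0])
    return ring


# A hypothetical triangle: three distinct corners go in, four points come out.
triangle = [(0.0, 0.0), (10.0, 0.0), (5.0, 8.0)]
closed = close_ring(triangle)
print(len(closed), closed[-1])
```

The NumPoints value and the content length for a polygon record have to count that repeated point as well.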


ESRI. 1995. ArcView Shapefile Technical Description. Available online as part of ESRI's "ArcView White Paper Series".

ESRI. 1998. Shapefile Technical Description. Available online as part of ESRI's "ArcView White Paper Series".

USGS. 1984. USGS Digital Cartographic Data Standards: Digital Line Graphs from 1:24,000-scale Maps, US Geological Survey Circular 895-C.