[PATCH] D89071: [SystemZ/z/OS] Add GOFFObjectFile class and details of GOFF file format

Kevin P. Neal via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Wed Nov 11 09:38:22 PST 2020


kpn added inline comments.


================
Comment at: llvm/lib/Object/GOFFObjectFile.cpp:49-50
+
+  const uint8_t *End = reinterpret_cast<const uint8_t *>(Data.getBufferEnd());
+  for (const uint8_t *I = base(); I < End; I += GOFF::RecordLength) {
+    uint8_t RecordType = (I[1] & 0xF0) >> 4;
----------------
yusra.syeda wrote:
> jhenderson wrote:
> > kpn wrote:
> > > jhenderson wrote:
> > > > yusra.syeda wrote:
> > > > > jhenderson wrote:
> > > > > > As noted earlier - it might be better to use the `DataExtractor` and `Cursor` class to make parsing easier.
> > > > > The DataExtractor class doesn't seem to be helpful. Its best use is if the data is read sequentially, which is not the case with GOFF.
> > > > You can use `DataExtractor` with offsets, rather than a `Cursor`, if the read is jumping around.
> > > Does this mean the ability to read RECFM=VB GOFF datasets is explicitly being designed to be impossible?
> > I'm not sure if this comment is being directed at me or @yusra.syeda. If at me, I don't really understand the question as I don't know the file format.
> The first goal is to get the compiler running in USS, which only supports a fixed block length of 80.
> 
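On the `DataExtractor` point above: the class can be driven by explicit offsets rather than a `Cursor`, which suits non-sequential formats. A minimal stand-alone sketch of that offset-based style (plain C++ mimicking the getter-with-offset pattern, not the LLVM class itself; `Extractor` and `recordType` are hypothetical names, and the type-nibble decode follows the patch's loop):

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for llvm::DataExtractor's offset-based getters:
// each read takes an explicit *Offset and advances it on success, so a
// parser can jump to arbitrary record boundaries instead of reading
// strictly sequentially.
struct Extractor {
  const uint8_t *Data;
  size_t Size;

  uint8_t getU8(uint64_t *Offset) const {
    if (*Offset + 1 > Size)
      return 0;
    return Data[(*Offset)++];
  }

  uint16_t getU16BE(uint64_t *Offset) const {
    if (*Offset + 2 > Size)
      return 0;
    uint16_t V = (uint16_t(Data[*Offset]) << 8) | Data[*Offset + 1];
    *Offset += 2;
    return V;
  }
};

// Jump straight to record N (fixed 80-byte records) and decode the record
// type from the high nibble of byte 1, as the patch's loop does.
inline uint8_t recordType(const Extractor &E, uint64_t RecordIdx) {
  uint64_t Off = RecordIdx * 80 + 1; // byte 1 of the chosen record
  return (E.getU8(&Off) & 0xF0) >> 4;
}
```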
Hmm, @yusra.syeda, your response is mostly correct, but not entirely.

It's true that the Unix-style filesystem (IBM calls it the "Hierarchical File System", with the first implementation being HFS and the second zFS) has no record support because Unix doesn't support records in files. Thus the 80-byte requirement on GOFF record sizes.

But there's no requirement that a program started under USS only access the Unix-style filesystem. There's no requirement that a program started under TSO or in batch only access traditional MVS datasets. Indeed, JCL even has support for Unix paths in DD statements, and TSO probably does as well (but I don't have my book handy).

So the compiler "running in USS" does _not_ mean that we are restricted to 80-byte GOFF records. Granted, disambiguation of Unix paths and MVS dataset names is a problem, but still.

I understand why you would want to leave variable-sized records for later. Not implementing support for MVS datasets up front is one thing; designing your code so that adding it later is difficult, if not impossible, is quite another. For example, random access to variable-size-record datasets is painful.

Are you at least looking ahead to adding RECFM=V support later?
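To make the design question concrete, here is a minimal sketch (not from the patch; all names are hypothetical) of how the record walk could be abstracted so that a later RECFM=V implementation only swaps the length computation. It assumes V-format records carry an RDW whose first two bytes give the big-endian record length, including the 4-byte RDW itself:

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical record reader: yields each record's address and length so
// callers never hard-code the 80-byte fixed stride.
struct RecordIterator {
  const uint8_t *Data;
  size_t Size;
  size_t Offset;
  bool Variable; // true for RECFM=V-style RDW-prefixed records

  // Returns the length of the next record (and its address via Rec),
  // or 0 when the buffer is exhausted or malformed.
  size_t next(const uint8_t **Rec) {
    if (Offset >= Size)
      return 0;
    size_t Len;
    if (Variable) {
      // RDW: bytes 0-1 hold the big-endian record length, which
      // includes the 4-byte RDW itself.
      if (Size - Offset < 4)
        return 0;
      Len = (size_t(Data[Offset]) << 8) | Data[Offset + 1];
      if (Len < 4 || Len > Size - Offset)
        return 0;
    } else {
      Len = 80; // fixed-block GOFF record length
      if (Size - Offset < Len)
        return 0;
    }
    *Rec = Data + Offset;
    Offset += Len;
    return Len;
  }
};
```

Random access stays cheap for fixed records (offset = index * 80); for V-format it would need an index built on a first pass, which is exactly the random-access pain mentioned above.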


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D89071/new/

https://reviews.llvm.org/D89071


