moab::FileTokenizer Class Reference

Parse a file as space-separated tokens.

#include <FileTokenizer.hpp>

Public Member Functions

 FileTokenizer (std::FILE *file_ptr, ReadUtilIface *read_util_ptr)
 constructor
 ~FileTokenizer ()
 destructor : closes file.
const char * get_string ()
 get next token
bool get_newline ()
 check for newline
bool get_doubles (size_t count, double *array)
 Parse a sequence of double values.
bool get_floats (size_t count, float *array)
 Parse a sequence of float values.
bool get_integers (size_t count, int *array)
 Parse a sequence of integer values.
bool get_long_ints (size_t count, long *array)
 Parse a sequence of integer values.
bool get_short_ints (size_t count, short *array)
 Parse a sequence of integer values.
bool get_bytes (size_t count, unsigned char *array)
 Parse a sequence of integer values.
bool get_binary (size_t bytes, void *mem)
 Read binary data (interleaved with ASCII)
bool get_booleans (size_t count, bool *array)
 Parse a sequence of bit or boolean values.
bool eof () const
int line_number () const
void unget_token ()
bool match_token (const char *string, bool print_error=true)
int match_token (const char *const *string_list, bool print_error=true)

Private Member Functions

bool get_double_internal (double &result)
bool get_long_int_internal (long &result)
bool get_boolean_internal (bool &result)
bool get_float_internal (float &result)
bool get_integer_internal (int &result)
bool get_short_int_internal (short &result)
bool get_byte_internal (unsigned char &result)

Private Attributes

std::FILE * filePtr
ReadUtilIface * readUtilPtr
char buffer [512]
char * nextToken
char * bufferEnd
int lineNumber
char lastChar

Detailed Description

Parse a file as space-separated tokens.

Author:
Jason Kraftcheck
Date:
30 Sept 2004

Reads a file and splits it into whitespace-separated tokens. It is provided in place of the standard C or C++ file parsing routines because it counts lines, which is useful for error reporting. It also provides utility methods for parsing VTK files, which is the intended use of this implementation.

Reads from the file with fread into an internal 512-byte buffer instead of relying on formatted stream input. A single token may not exceed the buffer size.
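
The idea of tokenizing while tracking line numbers can be sketched without MOAB. The class below is a hypothetical stand-in (LineCountingTokenizer and its members are illustrative names, not part of MOAB) that reads from a std::istream instead of a FILE pointer and an internal buffer. It only shows why a tokenizer that sees every newline can report the line of each token.

```cpp
#include <cassert>
#include <cctype>
#include <cstdio>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical stand-in for illustration only; this is not moab::FileTokenizer.
class LineCountingTokenizer {
public:
    explicit LineCountingTokenizer(std::istream& in) : in_(in), line_(1) {}

    // Stores the next whitespace-delimited token in `token`.
    // Returns false once the input is exhausted.
    bool next(std::string& token) {
        token.clear();
        int c = in_.get();
        // Skip leading whitespace, counting each newline passed over.
        while (c != EOF && std::isspace(c)) {
            if (c == '\n') ++line_;
            c = in_.get();
        }
        if (c == EOF) return false;
        // Collect characters up to the next whitespace or end of input.
        while (c != EOF && !std::isspace(c)) {
            token.push_back(static_cast<char>(c));
            c = in_.get();
        }
        // Put the delimiter back so a trailing newline is counted by the
        // next call, keeping line() equal to the line of the token just read.
        if (c != EOF) in_.unget();
        return true;
    }

    // Line number of the most recently returned token (1-based).
    int line() const { return line_; }

private:
    std::istream& in_;
    int line_;
};
```

Deferring the newline count until the next call mirrors the role that lastChar plays in the real class: the line number reported with an error is the line of the offending token, not the line after it.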

Definition at line 43 of file FileTokenizer.hpp.


Constructor & Destructor Documentation

moab::FileTokenizer::FileTokenizer ( std::FILE *  file_ptr,
ReadUtilIface *  read_util_ptr 
)

constructor

Parameters:
file_ptr  The file to read from.
read_util_ptr  Pointer to ReadUtilIface to use for reporting errors.

Definition at line 27 of file FileTokenizer.cpp.

  : filePtr( file_ptr ),
    readUtilPtr( rif_ptr ),
    nextToken( buffer ),
    bufferEnd( buffer ),
    lineNumber( 1 ),
    lastChar( '\0' )
  {}

moab::FileTokenizer::~FileTokenizer ( )

destructor : closes file.

The destructor closes the passed file handle. This is done as a convenience feature. If the caller creates an instance of this object on the stack, the file will automatically be closed when the caller returns.

Definition at line 36 of file FileTokenizer.cpp.

  { fclose( filePtr ); }

Member Function Documentation

bool moab::FileTokenizer::eof ( ) const

Check for end-of-file condition.

Definition at line 39 of file FileTokenizer.cpp.

  { return nextToken == bufferEnd && feof(filePtr); }
bool moab::FileTokenizer::get_binary ( size_t  bytes,
void *  mem 
)

Read binary data (interleaved with ASCII)

Read a block of binary data.

Parameters:
bytes  Number of bytes to read.
mem  Memory address at which to store data.

Definition at line 467 of file FileTokenizer.cpp.

{
    // if data in buffer
  if (nextToken != bufferEnd) {
      // if requested size fits within the buffer contents,
      // just pass back part of the buffer
    if (size <= (size_t)(bufferEnd - nextToken)) {
      memcpy( mem, nextToken, size );
      nextToken += size;
      return true;
    }
    
      // copy buffer contents into memory and clear buffer
    memcpy( mem, nextToken, bufferEnd - nextToken );
    size -= bufferEnd - nextToken;
    mem = reinterpret_cast<char*>(mem) + (bufferEnd - nextToken);
    nextToken = bufferEnd;
  }
  
    // read any additional data from file
  return size == fread( mem, 1, size, filePtr );
}
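
The two-stage read described above (drain what is already buffered, then read the rest from the file) can be sketched as a free function. This is an illustrative version under assumptions: read_binary is a hypothetical name, and its next and end parameters play the roles of nextToken and bufferEnd.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <cstring>

// Hypothetical free-function sketch of the two-stage binary read.
// `next`/`end` delimit the unread part of an in-memory buffer.
bool read_binary(char*& next, char* end, std::FILE* fp,
                 std::size_t size, void* mem) {
    assert(next <= end);
    const std::size_t buffered = static_cast<std::size_t>(end - next);

    // The whole request can be served from the buffer.
    if (size <= buffered) {
        std::memcpy(mem, next, size);
        next += size;
        return true;
    }

    // Otherwise hand over everything buffered and mark the buffer empty.
    std::memcpy(mem, next, buffered);
    next = end;

    // Read the remainder directly from the file; a short read is a failure.
    char* out = static_cast<char*>(mem) + buffered;
    const std::size_t want = size - buffered;
    return std::fread(out, 1, want, fp) == want;
}
```

The comparison direction matters: the buffer-only path is safe only when the request is no larger than the buffered byte count, otherwise the copy would run past the end of the buffer.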
bool moab::FileTokenizer::get_boolean_internal ( bool &  result) [private]

Internal implementation of get_booleans

Definition at line 248 of file FileTokenizer.cpp.

{
    // Get a token
  const char *token = get_string( );
  if (!token)
    return false;
  
  if (token[1] || (token[0] != '0' && token[0] != '1'))
  {
    readUtilPtr->report_error( 
      "Syntax error at line %d: expected 0 or 1, got \"%s\"",
      line_number(), token );
    return false;
  }

  result = token[0] == '1';
  return true;
}
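
The acceptance rule in the listing above (only the one-character tokens "0" and "1" are valid) can be isolated into a small helper. The name parse_bit is hypothetical and error reporting is omitted. Unlike the listing, this sketch tests token[0] before token[1], so an empty string is rejected without reading past its terminator.

```cpp
#include <cassert>

// Hypothetical helper: parse a token that must be exactly "0" or "1".
bool parse_bit(const char* token, bool& result) {
    assert(token != nullptr);
    // Reject anything whose first character is not a binary digit,
    // then anything longer than one character.
    if ((token[0] != '0' && token[0] != '1') || token[1] != '\0')
        return false;
    result = (token[0] == '1');
    return true;
}
```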
bool moab::FileTokenizer::get_booleans ( size_t  count,
bool *  array 
)

Parse a sequence of bit or boolean values.

Read the specified number of space-delimited values.

Parameters:
count  The number of values to read.
array  The memory at which to store the values.
Returns:
true if successful, false otherwise.

Definition at line 334 of file FileTokenizer.cpp.

{
  for (size_t i = 0; i < count; ++i)
  {
    if (!get_boolean_internal( *array ))
      return false;
    ++array;
  }
  return true;
}
bool moab::FileTokenizer::get_byte_internal ( unsigned char &  result) [private]

Internal implementation of get_bytes

Definition at line 200 of file FileTokenizer.cpp.

{
  long i;
  if (!get_long_int_internal( i ))
    return false;
  
  result = (unsigned char)i;
  if (i != (long)result)
  {
    readUtilPtr->report_error( "Numeric overflow at line %d.", line_number() );
    return false;
  }
  
  return true;
}
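
The byte, short and int readers all use the same overflow test: parse as long, cast to the narrower type, cast back, and compare. A generic sketch of that round-trip check follows; narrow_checked is a hypothetical name, not a MOAB function. For signed targets the out-of-range conversion is implementation-defined before C++20, although common compilers wrap modulo 2^N, which is what makes the comparison fail as intended.

```cpp
#include <cassert>

// Hypothetical helper: store `value` in a narrower integer type and
// report whether the value survived the conversion unchanged.
template <typename T>
bool narrow_checked(long value, T& result) {
    result = static_cast<T>(value);
    // If widening the stored value back to long changes it, the
    // original did not fit in T.
    return static_cast<long>(result) == value;
}
```

Note that for unsigned char this rejects negative inputs as well as values above 255, since -1 converts to 255 and 255 is not equal to -1.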
bool moab::FileTokenizer::get_bytes ( size_t  count,
unsigned char *  array 
)

Parse a sequence of integer values.

Read the specified number of space-delimited ints.

Parameters:
count  The number of values to read.
array  The memory at which to store the values.
Returns:
true if successful, false otherwise.

Definition at line 289 of file FileTokenizer.cpp.

{
  for (size_t i = 0; i < count; ++i)
  {
    if (!get_byte_internal( *array ))
      return false;
    ++array;
  }
  return true;
}
bool moab::FileTokenizer::get_double_internal ( double &  result) [private]

Internal implementation of get_doubles

Definition at line 130 of file FileTokenizer.cpp.

{
    // Get a token
  const char *token_end, *token = get_string( );
  if (!token)
    return false;
  
    // Check for hex value -- on some platforms (e.g. Linux), strtod
    // will accept hex values, on others (e.g. Sun) it will not.  Force
    // failure on hex numbers for consistency.
  if (token[0] && token[1] && token[0] == '0' && toupper(token[1]) == 'X')
  {
    readUtilPtr->report_error(
      "Syntax error at line %d: expected number, got \"%s\"",
      line_number(), token );
    return false;
  }
  
  
    // Parse token as double
  result = strtod( token, (char**)&token_end );

    // If the character one past the last one read by strtod
    // is not the null character terminating the string,
    // then the parse failed.
  if (*token_end)
  {
    readUtilPtr->report_error(
      "Syntax error at line %d: expected number, got \"%s\"",
      line_number(), token );
    return false;
  }
  
  return true;
}
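
The two checks in the listing above (reject a leading 0x, then require strtod to consume the whole token) can be sketched as a standalone function. parse_double is a hypothetical name and error reporting is left out. This sketch additionally requires that at least one character was consumed, which the listing does not need because get_string never returns an empty token.

```cpp
#include <cassert>
#include <cctype>
#include <cstdlib>

// Hypothetical helper: parse a complete token as a double, refusing
// hexadecimal input so results do not depend on the platform's strtod.
bool parse_double(const char* token, double& result) {
    assert(token != nullptr);
    // token[1] is safe to read here: token[0] == '0' means the string
    // has at least one character before its terminator.
    if (token[0] == '0' &&
        std::toupper(static_cast<unsigned char>(token[1])) == 'X')
        return false;

    char* end = nullptr;
    result = std::strtod(token, &end);
    // Succeed only if something was parsed and nothing is left over.
    return end != token && *end == '\0';
}
```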
bool moab::FileTokenizer::get_doubles ( size_t  count,
double *  array 
)

Parse a sequence of double values.

Read the specified number of space-delimited doubles.

Parameters:
countThe number of values to read.
arrayThe memory at which to store the values.
Returns:
true if successful, false otherwise.

Definition at line 278 of file FileTokenizer.cpp.

{
  for (size_t i = 0; i < count; ++i)
  {
    if (!get_double_internal( *array ))
      return false;
    ++array;
  }
  return true;
}
bool moab::FileTokenizer::get_float_internal ( float &  result) [private]

Internal implementation of get_floats

Definition at line 166 of file FileTokenizer.cpp.

{
  double d;
  if (!get_double_internal( d ))
    return false;
  
  result = (float)d;
  return true;
}
bool moab::FileTokenizer::get_floats ( size_t  count,
float *  array 
)

Parse a sequence of float values.

Read the specified number of space-delimited floats.

Parameters:
count  The number of values to read.
array  The memory at which to store the values.
Returns:
true if successful, false otherwise.

Definition at line 267 of file FileTokenizer.cpp.

{
  for (size_t i = 0; i < count; ++i)
  {
    if (!get_float_internal( *array ))
      return false;
    ++array;
  }
  return true;
}
bool moab::FileTokenizer::get_integer_internal ( int &  result) [private]

Internal implementation of get_integers

Definition at line 232 of file FileTokenizer.cpp.

{
  long i;
  if (!get_long_int_internal( i ))
    return false;
  
  result = (int)i;
  if (i != (long)result)
  {
    readUtilPtr->report_error( "Numeric overflow at line %d.", line_number() );
    return false;
  }
  
  return true;
}
bool moab::FileTokenizer::get_integers ( size_t  count,
int *  array 
)

Parse a sequence of integer values.

Read the specified number of space-delimited ints.

Parameters:
count  The number of values to read.
array  The memory at which to store the values.
Returns:
true if successful, false otherwise.

Definition at line 312 of file FileTokenizer.cpp.

{
  for (size_t i = 0; i < count; ++i)
  {
    if (!get_integer_internal( *array ))
      return false;
    ++array;
  }
  return true;
}
bool moab::FileTokenizer::get_long_int_internal ( long &  result) [private]

Internal implementation of get_long_ints

Definition at line 176 of file FileTokenizer.cpp.

{
    // Get a token
  const char *token_end, *token = get_string( );
  if (!token)
    return false;
  
    // Parse token as long
  result = strtol( token, (char**)&token_end, 0 );

    // If the character one past the last one read by strtol
    // is not the null character terminating the string,
    // then the parse failed.
  if (*token_end)
  {
    readUtilPtr->report_error(
      "Syntax error at line %d: expected integer, got \"%s\"",
      line_number(), token );
    return false;
  }

  return true;
}
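
Because the listing above passes base 0 to strtol, the prefix of the token selects the radix: a leading 0x means hexadecimal, a leading 0 means octal, and anything else is decimal. The sketch below illustrates the consequences with a hypothetical helper, parse_long, that applies the same whole-token check.

```cpp
#include <cassert>
#include <cstdlib>

// Hypothetical helper: parse a complete token as a long using
// strtol's automatic base detection (base 0).
bool parse_long(const char* token, long& result) {
    assert(token != nullptr);
    char* end = nullptr;
    result = std::strtol(token, &end, 0);
    // Succeed only if something was parsed and nothing is left over.
    return end != token && *end == '\0';
}
```

A practical effect is that zero-padded decimal fields do not behave as decimal: "010" parses as 8, and "08" is rejected because 8 is not an octal digit.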
bool moab::FileTokenizer::get_long_ints ( size_t  count,
long *  array 
)

Parse a sequence of integer values.

Read the specified number of space-delimited ints.

Parameters:
count  The number of values to read.
array  The memory at which to store the values.
Returns:
true if successful, false otherwise.

Definition at line 323 of file FileTokenizer.cpp.

{
  for (size_t i = 0; i < count; ++i)
  {
    if (!get_long_int_internal( *array ))
      return false;
    ++array;
  }
  return true;
}

bool moab::FileTokenizer::get_newline ( )

check for newline

Consume whitespace up to and including the next newline. If a non-space character is found before a newline, the function will stop, set the error message, and return false.

Returns:
True if a newline was found before any non-space character. False otherwise.

Definition at line 415 of file FileTokenizer.cpp.

{
  if (lastChar == '\n')
  {
    lastChar = ' ';
    ++lineNumber;
    return true;
  }
  
    // Loop until either we a) find a newline, b) find a non-whitespace
    // character or c) reach the end of the file.
  for (;;)
  {
      // If the buffer is empty, read more.
    if (nextToken == bufferEnd)
    {
      size_t count = fread( buffer, 1, sizeof(buffer), filePtr );
      if (!count)
      {
        if (eof())
          readUtilPtr->report_error( "File truncated at line %d.", line_number() );
        else
          readUtilPtr->report_error( "I/O Error" );
        break;
      }
      
      nextToken = buffer;
      bufferEnd = buffer + count;
    }
    
      // If the current character is not a space, then we've failed.
    if (!isspace(*nextToken))
    {
      readUtilPtr->report_error( "Expected newline at line %d.", line_number() );
      break;
    }
      
      // If the current space character is a newline,
      // increment the line number count.
    if (*nextToken == '\n')
    {
      ++lineNumber;
      ++nextToken;
      lastChar = ' ';
      return true;
    }
    ++nextToken;
  }
  
  return false;
}
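
The contract of get_newline (consume whitespace through the next newline, fail on any other character) can be sketched over an in-memory string. expect_newline, pos and line are hypothetical names; buffering, the lastChar shortcut and error reporting from the listing are omitted.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical helper: advance `pos` past whitespace up to and including
// the first '\n'. Returns false, leaving `pos` at the offending character,
// if a non-space character comes first or the input ends.
bool expect_newline(const std::string& text, std::size_t& pos, int& line) {
    while (pos < text.size()) {
        const unsigned char c = static_cast<unsigned char>(text[pos]);
        if (!std::isspace(c))
            return false;  // a token appears before the newline
        ++pos;
        if (c == '\n') {
            ++line;        // the newline has been consumed
            return true;
        }
    }
    return false;          // input ended without a newline
}
```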
bool moab::FileTokenizer::get_short_int_internal ( short &  result) [private]

Internal implementation of get_short_ints

Definition at line 216 of file FileTokenizer.cpp.

{
  long i;
  if (!get_long_int_internal( i ))
    return false;
  
  result = (short)i;
  if (i != (long)result)
  {
    readUtilPtr->report_error( "Numeric overflow at line %d.", line_number() );
    return false;
  }
  
  return true;
}
bool moab::FileTokenizer::get_short_ints ( size_t  count,
short *  array 
)

Parse a sequence of integer values.

Read the specified number of space-delimited ints.

Parameters:
count  The number of values to read.
array  The memory at which to store the values.
Returns:
true if successful, false otherwise.

Definition at line 300 of file FileTokenizer.cpp.

{
  for (size_t i = 0; i < count; ++i)
  {
    if (!get_short_int_internal( *array ))
      return false;
    ++array;
  }
  return true;
}

const char * moab::FileTokenizer::get_string ( )

get next token

Get the next whitespace-delimited token from the file. NOTE: The returned string is only valid until the next call to any of the functions in this class that read from the file.

Returns:
A pointer to the buffer space containing the string, or NULL if an error occurred.

Definition at line 42 of file FileTokenizer.cpp.

{
    // If the whitespace character marking the end of the
    // last token was a newline, increment the line count.
  if (lastChar == '\n')
    ++lineNumber;
  
    // Loop until we either find the start of a token to return
    // or reach the end of the file.
  for (;;)
  {
      // If the buffer is empty, read more.
    if (nextToken == bufferEnd)
    {
      size_t count = fread( buffer, 1, sizeof(buffer) - 1, filePtr );
      if (!count)
      {
        if (feof(filePtr))
          readUtilPtr->report_error( "File truncated at line %d\n", line_number() );
        else
          readUtilPtr->report_error( "I/O Error\n" );
        return NULL;
      }
      
      nextToken = buffer;
      bufferEnd = buffer + count;
    }
    
      // If the current character is not a space, we've found a token.
    if (!isspace(*nextToken))
      break;
      
      // If the current space character is a newline,
      // increment the line number count.
    if (*nextToken == '\n')
      ++lineNumber;
    ++nextToken;
  }
  
    // Store the start of the token in "result" and
    // advance "nextToken" to one past the end of the
    // token.
  char* result = nextToken;
  while (nextToken != bufferEnd && !isspace(*nextToken))
    ++nextToken;
  
    // If we have reached the end of the buffer without finding
    // a whitespace character terminating the token, we need to
    // read more from the file.  Only try once.  If the token is
    // too large to fit in the buffer, give up.
  if (nextToken == bufferEnd)
  {
      // Shift the (possibly) partial token to the start of the buffer.
    size_t remaining = bufferEnd - result;
    memmove( buffer, result, remaining );
    result = buffer;
    nextToken =  result + remaining;
    
      // Fill the remainder of the buffer after the token.
    size_t count = fread( nextToken, 1, sizeof(buffer) - remaining - 1, filePtr );
    if (!count && !feof(filePtr))
    {
      readUtilPtr->report_error( "I/O Error\n" );
      return NULL;
    }
    bufferEnd = nextToken + count;
    
      // Continue to advance nextToken until we find the space
      // terminating the token.
    while (nextToken != bufferEnd && !isspace(*nextToken))
      ++nextToken;
  
    if (nextToken == bufferEnd) // EOF
    {
      *bufferEnd = '\0';
      ++bufferEnd;
    }
  }
  
    // Save terminating whitespace character (or NULL char if EOF).
  lastChar = *nextToken;
    // Put null in buffer to mark end of current token.
  *nextToken = '\0';
    // Advance nextToken to the next character to search next time.
  ++nextToken;
  return result;
}
int moab::FileTokenizer::line_number ( ) const [inline]

Get the line number the last token was read from.

Definition at line 178 of file FileTokenizer.hpp.

{ return lineNumber; }
bool moab::FileTokenizer::match_token ( const char *  string,
bool  print_error = true 
)

Match current token to passed string. If token doesn't match, set error message.

Definition at line 362 of file FileTokenizer.cpp.

{
    // Get a token
  const char *token = get_string( );
  if (!token)
    return false;

    // Check if it matches
  if (0 == strcmp( token, str ))
    return true;
  
    // Construct error message
  if (print_error)
    readUtilPtr->report_error( "Syntax error at line %d: expected \"%s\", got \"%s\"",
                                line_number(), str, token );
  return false;
}
int moab::FileTokenizer::match_token ( const char *const *  string_list,
bool  print_error = true 
)

Match the current token to one of an array of strings. Sets the error message if the current token doesn't match any of the input strings.

Parameters:
string_list  A NULL-terminated array of strings.
Returns:
One greater than the index of the matched string, or zero if no match.

Definition at line 381 of file FileTokenizer.cpp.

{
    // Get a token
  const char *token = get_string( );
  if (!token)
    return 0;

    // Check if it matches any input string
  const char* const* ptr;
  for (ptr = list; *ptr; ++ptr)
    if (0 == strcmp( token, *ptr ))
      return ptr - list + 1;
  
  if (!print_error)
    return 0;
  
    // No match, construct error message
  std::string message( "Parsing error at line " );
  char lineno[16];
  sprintf( lineno, "%d", line_number() );
  message += lineno;
  message += ": expected one of {";
  for (ptr = list; *ptr; ++ptr)
  {
    message += " ";
    message += *ptr;
  }
  message += " } got \"";
  message += token;
  message += "\"";
  readUtilPtr->report_error( message );
  return 0;
}
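
The return convention of the list overload (one greater than the index of the match, zero for no match) is easy to misread, so here is a standalone sketch of just the lookup. match_one_of is a hypothetical name; it takes the token as a parameter instead of reading it from a file and reports no errors.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical helper: find `token` in a NULL-terminated array of strings.
// Returns the 1-based position of the match, or 0 if there is none.
int match_one_of(const char* token, const char* const* list) {
    assert(token != nullptr && list != nullptr);
    for (const char* const* p = list; *p; ++p)
        if (std::strcmp(token, *p) == 0)
            return static_cast<int>(p - list) + 1;
    return 0;
}
```

Using a 1-based result lets callers treat the return value as a truth value and still switch on which keyword was found, for example 1 for the first keyword and 2 for the second.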

void moab::FileTokenizer::unget_token ( )

Put current token back in buffer. Can only unget one token.

Definition at line 345 of file FileTokenizer.cpp.

{
  if (nextToken - buffer < 2)
    return;
  
  --nextToken;
  *nextToken = lastChar;
  --nextToken;
  while (nextToken > buffer && *nextToken)
    --nextToken;
    
  if (!*nextToken)
    ++nextToken;
    
  lastChar = '\0';
}

Member Data Documentation

char moab::FileTokenizer::buffer[512] [private]

Input buffer

Definition at line 227 of file FileTokenizer.hpp.

char* moab::FileTokenizer::bufferEnd [private]

One past the last used byte of the buffer

Definition at line 232 of file FileTokenizer.hpp.

std::FILE* moab::FileTokenizer::filePtr [private]

Pointer to standard C FILE struct

Definition at line 221 of file FileTokenizer.hpp.

char moab::FileTokenizer::lastChar [private]

The whitespace character marking the end of the last returned token. Saved here because if it is a newline, the line count will need to be incremented when the next token is returned.

Definition at line 242 of file FileTokenizer.hpp.

int moab::FileTokenizer::lineNumber [private]

Line number of last returned token

Definition at line 235 of file FileTokenizer.hpp.

char* moab::FileTokenizer::nextToken [private]

One past the end of the last token returned

Definition at line 230 of file FileTokenizer.hpp.

ReadUtilIface* moab::FileTokenizer::readUtilPtr [private]

Pointer to MOAB ReadUtil Interface

Definition at line 224 of file FileTokenizer.hpp.


The documentation for this class was generated from the following files:
FileTokenizer.hpp
FileTokenizer.cpp