I've been happily hacking away at my managed Scheme implementation. Just a quick progress report. I'm ~50% done with my parser, a hand-crafted LL(1)-style top-down predictive (recursive-descent) kind of guy. I'm realizing now that this might have been a suboptimal choice, as the Scheme grammar is fairly ambiguous at times and a single token of lookahead often isn't enough to pick a production, which results in quite a bit of backtracking. E.g. take a look at how many freaking nonterminals start with '(' and can appear interchangeably... Once I get this more baked, I might consider alternative algorithms as part of the optimization phase.
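To make the '(' problem concrete, here is a minimal Python sketch (not my actual parser code) of one way to cut down on backtracking: peek at the token after the '(' before committing to a production. The names `tokenize`, `classify` and `SPECIAL_FORMS` are made up for illustration, the tokenizer only understands parens, the quote mark and bare atoms, and derived expressions and macro uses are ignored entirely.

```python
import re

# Assumption: a toy tokenizer that only knows parens, the quote mark and atoms
# (no string literals, comments or vectors).
TOKEN_RE = re.compile(r"\s*([()']|[^\s()']+)")

def tokenize(source):
    """Split source text into a flat list of lexemes."""
    return TOKEN_RE.findall(source)

# Hypothetical table: the token right after '(' picks the production.
SPECIAL_FORMS = {
    "quote": "quotation",
    "lambda": "lambda expression",
    "if": "conditional",
    "set!": "assignment",
}

def classify(tokens, pos=0):
    """Name the production starting at tokens[pos] using two tokens of lookahead."""
    first = tokens[pos]
    if first == "'":
        return "quotation"
    if first != "(":
        return "self-evaluating or variable"
    if pos + 1 >= len(tokens):
        raise SyntaxError("unexpected end of input after '('")
    second = tokens[pos + 1]
    if second == ")":
        raise SyntaxError("empty combination")
    # Anything that is not a known keyword is treated as a procedure call.
    return SPECIAL_FORMS.get(second, "procedure call")
```

With this, `classify(tokenize("(lambda (x) x)"))` picks the lambda production without ever trying the others first. A real parser would still need to handle keywords that have been rebound, which this sketch does not.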
For those who are curious, I'm using this as my primary language reference.
My object model looks like this thus far:
Lexical tokens:
+ Token
- EosToken
- BooleanToken
+ IdentifierToken
- VariableToken
- KeywordToken
- LiteralToken
- NumberToken
- CharacterToken
- StringToken
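For a rough feel of how the token classes above might be used, here is a small Python sketch (the real implementation is not in Python). The nesting is an assumption read off the +/- markers: only VariableToken and KeywordToken are placed under IdentifierToken, and everything else hangs directly off Token. `make_token`, `KEYWORDS` and `NUMBER_RE` are hypothetical helpers, the keyword set is a subset, and the number pattern only covers plain decimals. LiteralToken is declared but unused, since the list does not say what it covers.

```python
import re

class Token:
    """Base class: every token remembers the lexeme it was built from."""
    def __init__(self, text):
        self.text = text

    def __repr__(self):
        return f"{type(self).__name__}({self.text!r})"

class EosToken(Token): pass
class BooleanToken(Token): pass
class IdentifierToken(Token): pass
class VariableToken(IdentifierToken): pass
class KeywordToken(IdentifierToken): pass
class LiteralToken(Token): pass      # role not described in the list; unused here
class NumberToken(Token): pass
class CharacterToken(Token): pass
class StringToken(Token): pass

# Assumption: a subset of Scheme's syntactic keywords.
KEYWORDS = {"quote", "lambda", "if", "set!", "define", "begin", "let", "cond", "else"}

# Assumption: plain decimal numbers only (no rationals, radix prefixes, etc.).
NUMBER_RE = re.compile(r"[+-]?(\d+(\.\d*)?|\.\d+)")

def make_token(lexeme):
    """Map a raw lexeme to an instance of the matching token class."""
    if lexeme == "":
        return EosToken(lexeme)          # empty lexeme stands in for end of stream
    if lexeme in ("#t", "#f"):
        return BooleanToken(lexeme)
    if lexeme.startswith("#\\"):
        return CharacterToken(lexeme)
    if lexeme.startswith('"'):
        return StringToken(lexeme)
    if NUMBER_RE.fullmatch(lexeme):
        return NumberToken(lexeme)
    if lexeme in KEYWORDS:
        return KeywordToken(lexeme)
    return VariableToken(lexeme)
```

Splitting identifiers into variables and keywords at lexing time lets the parser branch on the token's type, e.g. `isinstance(tok, KeywordToken)`, rather than comparing strings.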
Parse tree nodes:
+ TreeNode
+ ExpressionNode
+ LiteralExpressionNode
- QuotationExpressionNode
- SelfEvaluatingExpressionNode
- ProcedureCallExpressionNode
- LambdaExpressionNode
- ConditionalExpressionNode
- AssignmentExpressionNode
- DerivedExpressionNode
- MacroUseExpressionNode
- MacroBlockExpressionNode
- FormalsNode
- BodyNode
- SequenceNode
+ DatumNode
- SimpleDatumNode
+ CompoundDatumNode
- ListDatumNode
- AbbreviationDatumNode
- VectorDatumNode
The latter (the parse tree nodes), like the parser itself, is only ~50% complete. Still happy with progress thus far... Got a looooooooonng way to go.