[RFC] EON: A simpler configuration format
April 23, 2022
Do you like YAML's relative readability but dislike all the ways a user could shoot themselves in the foot? If you're like me, you probably agree that it's not the user's job to remember that the file format they're using automatically converts `true`, `on`, `yes`, or `y` to a boolean TRUE[1]. In other words, the semantics of the token `true` is not inherent to that token, but is determined by the field you're currently configuring.
For example, say you want a list of bash commands, where each command is represented as a list, with commands and arguments separated out:
- [true]
- [yes]
- [chmod, 0700, foo.txt]
- [git, add, *.txt]
This currently has the following issues:
- The first command would be the boolean `true`, not the Bash command `true`
- The second command would also be the boolean `true`, not the Bash command `yes`
- Depending on your YAML library implementation, the third command may parse `0700` as the number `700`
- The last command would throw an error, since `*` starts a YAML alias, and `.` is not a valid character in a YAML alias.
For the user, this very much violates the principle of least surprise. As the developer, I can't do anything about it, since YAML does the type conversion automatically before I get the value at all. There's absolutely nothing I can do to recover the user's actual input (e.g. I have no way of knowing whether the user typed `yes`); the onus is on the user to quote the value.
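To illustrate the information loss described above, here is a minimal Python sketch. It does not use a real YAML library; `resolve_scalar` is a hypothetical stand-in that mimics only the boolean spellings listed in the YAML 1.1 bool type, and leaves every other token as a string.

```python
import re

# Boolean spellings from the YAML 1.1 bool type definition.
YAML11_BOOL = re.compile(
    r"^(?:y|Y|yes|Yes|YES|n|N|no|No|NO"
    r"|true|True|TRUE|false|False|FALSE"
    r"|on|On|ON|off|Off|OFF)$"
)
YAML11_TRUE = {"y", "yes", "true", "on"}


def resolve_scalar(token: str):
    """Hypothetical resolver: YAML 1.1 style implicit booleans, else the raw string."""
    if YAML11_BOOL.match(token):
        return token.lower() in YAML11_TRUE
    return token


commands = [["true"], ["yes"], ["git", "add"]]
resolved = [[resolve_scalar(tok) for tok in cmd] for cmd in commands]
print(resolved)  # [[True], [True], ['git', 'add']]
```

Once `true` and `yes` have both collapsed to `True`, the application has no way to tell which one the user wrote.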
Introducing the Easy Object Notation format
EON files are suffixed with `.eon` and should generally work with YAML syntax highlighting. Conceptually, EON parses to a JSON object[2] containing objects, lists, strings, and nothing else. EON also provides a standard specification for other types, and EON implementations should provide parsers out-of-the-box for these types, but conceptually, all other types are parsed from strings.
This allows the user to write `x: true`, and the developer to determine whether that `true` is a boolean or a string. The core intuition here is that you generally aren't decoding data in a vacuum; you're generally deserializing some object that knows the types it wants for the fields. If you've ever decoded enums from JSON, you're probably already doing this extra step anyway.
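As a rough sketch of that idea in Python: the parser hands back strings, and each field picks its own decoder. The `decode_bool` helper and the `parsed` dict are illustrative assumptions, not part of any existing EON library.

```python
def decode_bool(raw: str) -> bool:
    """Hypothetical decoder: accept only true/false, case-insensitively."""
    lowered = raw.lower()
    if lowered == "true":
        return True
    if lowered == "false":
        return False
    raise ValueError(f"Invalid boolean: {raw!r}")


# What an EON parser would conceptually hand back for `x: true`.
parsed = {"x": "true"}

as_string = parsed["x"]                # a field that wants the literal text
as_boolean = decode_bool(parsed["x"])  # a field that wants a flag
```

The same input token serves both fields; the decision lives in the application, not in the file format.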
This is still a rough draft, but it covers all the major points. If there's even a modicum of interest in this, I might put up an actual spec.
Objects are basically the same as YAML:
- Can use either indentation-based keys or `{}` syntax
- Duplicate keys are an error
- Order not guaranteed
- Keys may optionally be quoted, to allow a key to include special characters
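The duplicate-key rule could be enforced while the parser assembles an object. A minimal sketch, assuming the parser yields `(key, value)` pairs; `build_object` and `DuplicateKeyError` are hypothetical names:

```python
class DuplicateKeyError(ValueError):
    """Raised when the same key appears twice in one object."""


def build_object(pairs):
    """Collect (key, value) pairs into a dict, rejecting repeated keys."""
    obj = {}
    for key, value in pairs:
        if key in obj:
            raise DuplicateKeyError(f"Duplicate key: {key!r}")
        obj[key] = value
    return obj


config = build_object([("name", "demo"), ("port", "8080")])
```

Python dicts happen to preserve insertion order, but since order is not guaranteed by the format, callers should not rely on it.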
Lists are also basically the same as YAML: can use either `-` or `[]` syntax for lists.
All scalars are initially strings. Developers decide how to deserialize strings per-field.
- There would be some first-class support for multiline strings, without needing quotes or `long_description: |` syntax or anything.
- Scalars may optionally be quoted, to allow a string to start with a special character or something
EON implementations must provide parsers out-of-the-box for basic types:
- Boolean: `true` and `false` (case insensitive) map to TRUE and FALSE
- Integer: arbitrary precision, supports binary, octal, hex
- Float: arbitrary precision, supports scientific notation
- Null: `null` should map to the language's idiomatic representation of NULL
- Future: datetime types like TOML?
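The basic-type parsers listed above might look roughly like this in Python. All function names are hypothetical. `Decimal` stands in for an arbitrary-precision float, the `0b`/`0o`/`0x` prefixes are an assumption about how binary, octal, and hex would be spelled, and matching `null` exactly (rather than case-insensitively) is also an assumption.

```python
from decimal import Decimal, InvalidOperation


def parse_boolean(raw: str) -> bool:
    """Case-insensitive true/false; anything else is an error."""
    lowered = raw.lower()
    if lowered in ("true", "false"):
        return lowered == "true"
    raise ValueError(f"Invalid boolean: {raw!r}")


def parse_integer(raw: str) -> int:
    """Base 0 makes int() honour 0b, 0o and 0x prefixes; Python ints are unbounded."""
    return int(raw, 0)


def parse_float(raw: str) -> Decimal:
    """Decimal keeps full precision and accepts scientific notation."""
    try:
        return Decimal(raw)
    except InvalidOperation:
        raise ValueError(f"Invalid float: {raw!r}") from None


def parse_null(raw: str) -> None:
    """Map the null token to Python's idiomatic None."""
    if raw == "null":
        return None
    raise ValueError(f"Invalid null: {raw!r}")
```

One side effect of `int(raw, 0)` is that a string like `0700` raises `ValueError` instead of being silently read as decimal or octal, so the user has to write `0o700` or `700` explicitly.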
API examples
Some potential API ideas for parsing the given config:
str_field: hello world
multi_field: { a: 1.5, b: 2.2 }
list_field:
  - 1
  - 2
baz:
  state: ON
  required_nullable: null
from __future__ import annotations
from dataclasses import dataclass
from enum import Enum
import eon

@dataclass
class Config:
    str_field: str
    optional_bool: bool | None
    multi_field: float | dict[str, float]
    list_field: list[int]
    baz: Baz

    @classmethod
    def __decode_eon__(cls, v: eon.Value) -> Config:
        o = v.object()
        return Config(
            ...
            # helper equivalent to: lambda v: v.float()
        )

@dataclass
class Baz:
    state: State
    required_nullable: str | None

    @classmethod
    def __decode_eon__(cls, v: eon.Value) -> Baz:
        o = v.object()
        return Baz(
            ...
        )

class State(Enum):
    ON = "ON"
    OFF = "OFF"

    @classmethod
    def __decode_eon__(cls, v: eon.Value) -> State:
        s = v.string()
        try:
            return cls[s.upper()]
        except KeyError:
            raise eon.DecodeError(f"Invalid State: {s}")

def main():
    cfg = eon.loads(s, Config)

type Config = {
  str_field: string
  optional_bool?: boolean
  multi_field: number | Record<string, number>
  list_field: number[]
  baz: Baz
}

type Baz = {
  state: State
  required_nullable: string | null
}

type State = 'ON' | 'OFF'

const decodeConfig = (v: eon.Value): Config => {
  const o = v.object()
  return {
    str_field: o.str_field.string(),
    optional_bool: o.optional_bool?.boolean(),
    multi_field: o.multi_field.oneOf(
      // helper equivalent to: (v) => v.float()
      // ...
    ),
    list_field: o.list_field.list_of(eon.integer()),
    baz: o.baz.decodeWith(decodeBaz),
  }
}

const decodeBaz = (v: eon.Value): Baz => {
  const o = v.object()
  return {
    state: o.state.decodeWith(decodeState),
    required_nullable: o.required_nullable.oneOf(
      // ...
    ),
  }
}

const decodeState = (v: eon.Value): State => {
  const s = v.string()
  switch (s) {
    case 'ON':
    case 'OFF':
      return s
  }
  throw new eon.DecodeError(`Invalid State: ${s}`)
}

const main = () => {
  const cfg = eon.parse(s).decodeWith(decodeConfig)
}

data Config = Config
  { strField :: Text
  , optionalBool :: Maybe Bool
  , multiField :: Either Double (Map Text Double)
  , listField :: [Int]
  , baz :: Baz
  }
  deriving (Show)

data Baz = Baz
  { state :: State
  , requiredNullable :: Maybe Text
  }
  deriving (Show)

data State = ON | OFF
  deriving (Show)

instance FromEON Config where
  parseEON = parseObject "Config" $
    -- the type to parse a field as is determined
    -- automatically by FromEON instance resolution
    Config
      <$> parseField "str_field"
      <*> parseField "optional_bool"
      <*> parseField "multi_field"
      <*> parseField "list_field"
      <*> parseField "baz"

instance FromEON Baz where
  parseEON = parseObject "Baz" $
    Baz
      <$> parseField "state"
      <*> parseField "required_nullable"

instance FromEON State where
  parseEON = parseText >>= \case
    "ON" -> pure ON
    "OFF" -> pure OFF
    s -> fail $ "Invalid State: " <> show s

main :: IO ()
main = either (error . show) print $ Eon.decode s
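
To make the Python API idea above a bit more concrete, here is a minimal sketch of what a `Value` wrapper could look like. Everything here is an assumption for illustration: the `Value` and `DecodeError` names mirror the examples, but the internals (a wrapper around a parsed tree of `str`, `list`, and `dict`) are made up, and only a few of the accessor methods are shown.

```python
from dataclasses import dataclass
from typing import Any, Callable, TypeVar

T = TypeVar("T")


class DecodeError(Exception):
    """Raised when a node does not have the shape or type a decoder expects."""


@dataclass(frozen=True)
class Value:
    """Hypothetical wrapper around a parsed node: a str, a list, or a dict."""

    raw: Any

    def string(self) -> str:
        if not isinstance(self.raw, str):
            raise DecodeError(f"Expected a string, got {type(self.raw).__name__}")
        return self.raw

    def integer(self) -> int:
        text = self.string()
        try:
            return int(text, 0)
        except ValueError:
            raise DecodeError(f"Invalid integer: {text!r}") from None

    def object(self) -> dict[str, "Value"]:
        if not isinstance(self.raw, dict):
            raise DecodeError(f"Expected an object, got {type(self.raw).__name__}")
        return {key: Value(child) for key, child in self.raw.items()}

    def list_of(self, decode: Callable[["Value"], T]) -> list[T]:
        if not isinstance(self.raw, list):
            raise DecodeError(f"Expected a list, got {type(self.raw).__name__}")
        return [decode(Value(item)) for item in self.raw]


# A parsed tree as the conceptual model describes it: only objects, lists, strings.
tree = {"str_field": "hello world", "list_field": ["1", "2"]}
o = Value(tree).object()
str_field = o["str_field"].string()
list_field = o["list_field"].list_of(Value.integer)
```

The point of the wrapper is that type errors surface at the field that asked for the type, with the user's original text still available for the error message.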
These are some random musings I have that would be nice to include in the spec, but I'm not tied to them.
I would probably enforce that lists are indented one more level, e.g.
a_list:
  - a
  - b
  - c
and disallow
a_list:
- a
- b
- c
since it annoys me that there are two ways, and two developers editing the same file might be inconsistent with the indentation.
Smart quotes have bitten people before when copy-pasting. I would probably want to require smart quotes or apostrophes to be quoted, always.
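A parser could flag smart quotes in an unquoted scalar with a check along these lines. This is only a sketch: `find_smart_quotes` is a hypothetical helper, and it covers just the four common curly quote characters, not plain apostrophes.

```python
# The four common "smart" quote characters and their Unicode names.
SMART_QUOTES = {
    "\u2018": "left single quotation mark",
    "\u2019": "right single quotation mark",
    "\u201c": "left double quotation mark",
    "\u201d": "right double quotation mark",
}


def find_smart_quotes(scalar: str) -> list[tuple[int, str]]:
    """Return (index, name) for every smart quote found in the scalar."""
    return [(i, SMART_QUOTES[ch]) for i, ch in enumerate(scalar) if ch in SMART_QUOTES]


hits = find_smart_quotes("\u201chello\u201d")
```

A parser following the rule above would raise an error whenever this list is non-empty for an unquoted scalar, pointing the user at the exact position.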
Comparison to other formats
- XML is a bit verbose for me, and very difficult to read. The line between attributes and subnodes is a bit fuzzy, and it's annoying to have to repeat the whole tag name to open and close the node. One thing that XML shares with EON, though, is that it also treats node contents as strings, with type deserialization being defined in the application.
- JSON is really good for transferring data between machines, but is not good for user generation + consumption. It doesn't support comments, it doesn't support trailing commas, and, while unambiguous in terms of what things parse to, is relatively hard to read.
- YAML is rather nice, but the spec includes way too many features, and has some confusing behaviors. Feature-wise, it supports anchors (making `&` and `*` reserved characters), type-defining pragmas (which led to arbitrary code execution in the initial Python implementation), and automatic value conversions (`[x,y,z]` => `["x", true, "z"]`), which are way too featureful, A. for what I want to support in a config file, and B. for library implementers to be completely spec-compliant.
- TOML is really good for flat-ish configs, but if you want to nest lists and objects a few levels, TOML really starts to stretch to its limit.
- Typed configuration languages like Dhall or Nickel have a lot of syntax, are a bit difficult to learn, and, similar to YAML, are more featureful than I need/want.
Yes, obligatory XKCD:
I think all of the above formats are really good for certain things, but I think the one thing currently lacking is a readable format (like YAML) with very simple semantics (unlike YAML). EON fills this gap, and introduces a new concept of a configuration format that doesn't force users to know about types. If you have a field where you really need to allow the user to specify if they mean the number `1` or the string `"1"`, then sure, go with YAML. But I would imagine 90% of the time, the user doesn't need the distinction, which is where EON's simplicity shines over YAML.
Next steps
What are your thoughts? Would you find this format useful? Vote and/or comment here!
Huge thanks to Paul Craddick + Greg Lorence for fleshing out some of these thoughts with me.
[1] Yes, YAML 1.2 only allows `true` and `false` now, but does your YAML parsing library use 1.2? If your library uses `libyaml`, you're still on 1.1.
[2] Actual implementation may avoid the intermediate JSON object for performance reasons. But this is the conceptual model that one should have.