Documentation ¶
Overview ¶
Package xml provides conversion between AST nodes and Go native types.
Package xml provides a user-friendly DOM API for XML manipulation.
The DOM API provides type-safe, fluent interfaces for building and manipulating XML documents without requiring type assertions or working with raw AST nodes.
Element Type ¶
Element represents an XML element with chainable methods:
elem := xml.NewElement().
	Attr("id", "123").
	Text("Alice")
Nested Structures ¶
Build complex nested XML structures fluently:
doc := xml.NewElement().
	Attr("id", "123").
	Child("name", xml.NewElement().Text("Alice")).
	Child("email", xml.NewElement().Text("alice@example.com"))
Package xml provides AST rendering to XML bytes.
This file implements the core XML rendering functionality, converting Shape AST nodes back into XML byte representations.
Package xml provides XML format parsing and AST generation.
This package implements a complete XML parser that parses XML data into Shape's unified AST representation.
Architecture ¶
This parser uses LL(1) recursive descent parsing (see Shape ADR 0004). The grammar is defined in docs/grammar/xml.ebnf and verified through automated tests (see Shape ADR 0005: Grammar-as-Verification).
Thread Safety ¶
All functions in this package are safe for concurrent use by multiple goroutines. Each function call creates its own parser instance with no shared mutable state.
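A minimal sketch of what this guarantee permits in practice. It does not import this package (the import path is not given here), so the standard library's encoding/xml stands in as the parser. parseName and parseAll are hypothetical helpers; the point is only that each call keeps its state local, so calls can run in separate goroutines without locking.

```go
package main

import (
	stdxml "encoding/xml"
	"fmt"
	"sync"
)

// parseName is a hypothetical stand-in for a parse call. All of its
// state lives in local variables, so concurrent calls cannot interfere.
func parseName(input string) (string, error) {
	var v struct {
		Name string `xml:"name"`
	}
	if err := stdxml.Unmarshal([]byte(input), &v); err != nil {
		return "", err
	}
	return v.Name, nil
}

// parseAll parses every input in its own goroutine and returns the
// results in input order. Each goroutine writes only to its own index.
func parseAll(inputs []string) ([]string, error) {
	names := make([]string, len(inputs))
	errs := make([]error, len(inputs))
	var wg sync.WaitGroup
	for i, in := range inputs {
		wg.Add(1)
		go func(i int, in string) {
			defer wg.Done()
			names[i], errs[i] = parseName(in)
		}(i, in)
	}
	wg.Wait()
	for _, err := range errs {
		if err != nil {
			return nil, err
		}
	}
	return names, nil
}

func main() {
	names, err := parseAll([]string{
		`<user><name>Alice</name></user>`,
		`<user><name>Bob</name></user>`,
	})
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	fmt.Println(names) // [Alice Bob]
}
```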
Parsing APIs ¶
The package provides two parsing functions:
- Parse(string) - Parses XML from a string in memory
- ParseReader(io.Reader) - Parses XML from any io.Reader with streaming support
Use Parse() for small XML documents that are already in memory as strings. Use ParseReader() for large files, network streams, or any io.Reader source.
Example usage with Parse: ¶
xmlStr := `<user id="123"><name>Alice</name></user>`
node, err := xml.Parse(xmlStr)
if err != nil {
// handle error
}
// node is now a *ast.ObjectNode representing the XML data
Example usage with ParseReader: ¶
file, err := os.Open("data.xml")
if err != nil {
// handle error
}
defer file.Close()
node, err := xml.ParseReader(file)
if err != nil {
// handle error
}
// node is now a *ast.ObjectNode representing the XML data
Index ¶
- func Format() string
- func InterfaceToNode(v interface{}) (ast.SchemaNode, error)
- func Marshal(v interface{}) ([]byte, error)
- func MarshalIndent(v interface{}, prefix, indent string) ([]byte, error)
- func NodeToInterface(node ast.SchemaNode) interface{}
- func Parse(input string) (ast.SchemaNode, error)
- func ParseReader(reader io.Reader) (ast.SchemaNode, error)
- func ReleaseTree(node ast.SchemaNode)
- func Render(node ast.SchemaNode) ([]byte, error)
- func RenderIndent(node ast.SchemaNode, prefix, indent string) ([]byte, error)
- func Unmarshal(data []byte, v interface{}) error
- func Validate(input string) error
- func ValidateReader(reader io.Reader) error
- type Element
- func NewElement() *Element
- func (e *Element) Attr(name, value string) *Element
- func (e *Element) Attrs() []string
- func (e *Element) CDATA(value string) *Element
- func (e *Element) Child(name string, child *Element) *Element
- func (e *Element) ChildText(name, text string) *Element
- func (e *Element) Children() []string
- func (e *Element) Get(key string) (interface{}, bool)
- func (e *Element) GetAttr(name string) (string, bool)
- func (e *Element) GetCDATA() (string, bool)
- func (e *Element) GetChild(name string) (*Element, bool)
- func (e *Element) GetText() (string, bool)
- func (e *Element) Has(key string) bool
- func (e *Element) HasAttr(name string) bool
- func (e *Element) Keys() []string
- func (e *Element) Remove(key string) *Element
- func (e *Element) RemoveAttr(name string) *Element
- func (e *Element) Set(key string, value interface{}) *Element
- func (e *Element) Text(value string) *Element
- func (e *Element) ToMap() map[string]interface{}
- func (e *Element) XML(elementName string) (string, error)
- func (e *Element) XMLIndent(elementName, prefix, indent string) (string, error)
- type Marshaler
Constants ¶
This section is empty.
Variables ¶
This section is empty.
Functions ¶
func Format ¶
func Format() string
Format returns the format identifier for this parser. Returns "XML" to identify this as the XML data format parser.
func InterfaceToNode ¶
func InterfaceToNode(v interface{}) (ast.SchemaNode, error)
InterfaceToNode converts native Go types to AST nodes for XML.
Converts:
- string → *ast.LiteralNode
- int, int64, int32, etc → *ast.LiteralNode
- float64, float32 → *ast.LiteralNode
- bool → *ast.LiteralNode
- nil → *ast.LiteralNode
- []interface{} → *ast.ArrayDataNode
- map[string]interface{} → *ast.ObjectNode
- *Element → *ast.ObjectNode
This function recursively processes nested structures.
For XML structure, expects:
- Attributes as "@attrname" keys
- Text content as "#text" key
- CDATA sections as "#cdata" key
- Child elements as nested maps
Example:
data := map[string]interface{}{
"@id": "123",
"name": map[string]interface{}{"#text": "Alice"},
}
node, _ := xml.InterfaceToNode(data)
// node is an *ast.ObjectNode representing the XML structure
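To make the key conventions above concrete, here is a small self-contained sketch that turns such a map into XML text. renderMap is a hypothetical helper, not part of this package, and it simplifies in several ways: it only handles string values and nested maps, it uses encoding/xml's EscapeText for escaping, and it emits keys in sorted order rather than document order.

```go
package main

import (
	"bytes"
	stdxml "encoding/xml"
	"fmt"
	"sort"
	"strings"
)

// renderMap is a hypothetical helper that writes one element from a map
// following the "@attr", "#text", "#cdata" and nested-map conventions.
// Values of other types are silently skipped in this sketch.
func renderMap(name string, m map[string]interface{}) string {
	keys := make([]string, 0, len(m))
	for k := range m {
		keys = append(keys, k)
	}
	sort.Strings(keys) // map iteration order is random; sort for stable output

	var attrs, body bytes.Buffer
	for _, k := range keys {
		switch v := m[k].(type) {
		case string:
			switch {
			case strings.HasPrefix(k, "@"): // attribute
				attrs.WriteString(" " + k[1:] + `="`)
				stdxml.EscapeText(&attrs, []byte(v))
				attrs.WriteString(`"`)
			case k == "#text": // escaped text content
				stdxml.EscapeText(&body, []byte(v))
			case k == "#cdata": // CDATA section, written unescaped
				body.WriteString("<![CDATA[" + v + "]]>")
			}
		case map[string]interface{}: // nested child element
			body.WriteString(renderMap(k, v))
		}
	}
	return "<" + name + attrs.String() + ">" + body.String() + "</" + name + ">"
}

func main() {
	data := map[string]interface{}{
		"@id":  "123",
		"name": map[string]interface{}{"#text": "Alice"},
	}
	fmt.Println(renderMap("user", data)) // <user id="123"><name>Alice</name></user>
}
```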
func Marshal ¶
func Marshal(v interface{}) ([]byte, error)
Marshal returns the XML encoding of v.
Marshal traverses the value v recursively. If an encountered value implements the xml.Marshaler interface, Marshal calls its MarshalXML method to produce XML.
Otherwise, Marshal uses the following type-dependent default encodings:
Boolean values encode as XML text (true/false).
Floating-point and integer values encode as XML text.
String values encode as XML text with proper escaping.
Array and slice values encode as a sequence of XML elements with the same name.
Struct values encode as XML elements. Each exported struct field becomes either an XML element or attribute, using the field name as the element/attribute name, unless the field is omitted for one of the reasons given below.
The encoding of each struct field can be customized by the format string stored under the "xml" key in the struct field's tag. The format string gives the name of the field, possibly followed by a comma-separated list of options. The name may be empty in order to specify options without overriding the default field name.
The "attr" option specifies that the field should be encoded as an XML attribute.
The "chardata" option specifies that the field contains the text content of the element.
The "cdata" option specifies that the field contains CDATA content.
The "omitempty" option specifies that the field should be omitted from the encoding if the field has an empty value, defined as false, 0, a nil pointer, a nil interface value, and any empty array, slice, map, or string.
As a special case, if the field tag is "-", the field is always omitted.
Map values encode as XML elements, with the map keys used as element names. The map's key type must be a string.
Pointer values encode as the value pointed to. A nil pointer encodes as an empty XML element.
Interface values encode as the value contained in the interface. A nil interface value encodes as an empty XML element.
XML cannot represent cyclic data structures and Marshal does not handle them. Passing cyclic structures to Marshal will result in an error.
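The tag options described above ("attr", "omitempty", "-") have the same names as those in the standard library's encoding/xml. The sketch below uses encoding/xml as a stand-in to show their effect; it is an assumption that this package produces identical output. The User type, the encodeUser helper, and the XMLName field (an encoding/xml convention for naming the root element) are illustrative only.

```go
package main

import (
	stdxml "encoding/xml"
	"fmt"
)

// User is a hypothetical type used only for this illustration.
type User struct {
	XMLName stdxml.Name `xml:"user"`           // root element name (encoding/xml convention)
	ID      string      `xml:"id,attr"`        // encoded as an attribute
	Name    string      `xml:"name"`           // encoded as a child element
	Nick    string      `xml:"nick,omitempty"` // omitted when empty
	Secret  string      `xml:"-"`              // never encoded
}

// encodeUser marshals u with the standard library encoder.
func encodeUser(u User) (string, error) {
	out, err := stdxml.Marshal(u)
	return string(out), err
}

func main() {
	s, err := encodeUser(User{ID: "123", Name: "Alice", Secret: "hunter2"})
	if err != nil {
		fmt.Println("marshal error:", err)
		return
	}
	fmt.Println(s) // <user id="123"><name>Alice</name></user>
}
```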
func MarshalIndent ¶
func MarshalIndent(v interface{}, prefix, indent string) ([]byte, error)
MarshalIndent works like Marshal but with indentation for readability. Each XML element begins on a new line starting with prefix followed by one or more copies of indent according to the nesting depth.
func NodeToInterface ¶
func NodeToInterface(node ast.SchemaNode) interface{}
NodeToInterface converts an AST node to native Go types.
Converts:
- *ast.LiteralNode → primitives (string, int64, float64, bool, nil)
- *ast.ArrayDataNode → []interface{}
- *ast.ObjectNode (array - legacy) → []interface{}
- *ast.ObjectNode (object) → map[string]interface{}
This function recursively processes nested structures.
For XML, this preserves the structure:
- Attributes as "@attrname" keys
- Text content as "#text" key
- CDATA sections as "#cdata" key
- Child elements as nested maps
Example:
node, _ := xml.Parse(`<user id="123"><name>Alice</name></user>`)
data := xml.NodeToInterface(node)
// data is map[string]interface{}{"@id":"123", "name":map[string]interface{}{"#text":"Alice"}}
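Reading values back out of the resulting map requires type assertions. The sketch below works on a map literal shaped like the one in the example above, so it runs without this package. attr and childText are hypothetical helpers, and they assume attribute and text values are strings, as in that example.

```go
package main

import "fmt"

// attr is a hypothetical helper that looks up an attribute stored
// under the "@"-prefixed key and reports whether it was a string.
func attr(m map[string]interface{}, name string) (string, bool) {
	s, ok := m["@"+name].(string)
	return s, ok
}

// childText is a hypothetical helper that returns the "#text" value of
// the named child element, if the child exists and is a nested map.
func childText(m map[string]interface{}, name string) (string, bool) {
	child, ok := m[name].(map[string]interface{})
	if !ok {
		return "", false
	}
	s, ok := child["#text"].(string)
	return s, ok
}

func main() {
	// Same shape as the NodeToInterface example result.
	data := map[string]interface{}{
		"@id":  "123",
		"name": map[string]interface{}{"#text": "Alice"},
	}
	id, _ := attr(data, "id")
	name, _ := childText(data, "name")
	fmt.Println(id, name) // 123 Alice
}
```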
func Parse ¶
func Parse(input string) (ast.SchemaNode, error)
Parse parses XML format into an AST from a string.
The input is a complete XML document with a root element.
Returns an ast.SchemaNode representing the parsed XML:
- *ast.ObjectNode for elements
- Properties prefixed with "@" for attributes
- "#text" property for text content
- "#cdata" property for CDATA sections
For parsing large files or streaming data, use ParseReader instead.
Example:
node, err := xml.Parse(`<user id="123"><name>Alice</name></user>`)
if err != nil {
	// handle error
}
obj := node.(*ast.ObjectNode)
idNode, _ := obj.GetProperty("@id")
id := idNode.(*ast.LiteralNode).Value().(string) // "123"
func ParseReader ¶
func ParseReader(reader io.Reader) (ast.SchemaNode, error)
ParseReader parses XML format into an AST from an io.Reader.
This function is designed for parsing large XML files or streaming data with constant memory usage. It uses a buffered stream implementation that reads data in chunks, making it suitable for files that don't fit entirely in memory.
The reader can be any io.Reader implementation:
- os.File for reading from files
- strings.Reader for reading from strings
- bytes.Buffer for reading from byte slices
- Network streams, compressed streams, etc.
Returns an ast.SchemaNode representing the parsed XML:
- *ast.ObjectNode for elements
- Properties prefixed with "@" for attributes
- "#text" property for text content
- "#cdata" property for CDATA sections
Example parsing from a file:
file, err := os.Open("data.xml")
if err != nil {
// handle error
}
defer file.Close()
node, err := xml.ParseReader(file)
if err != nil {
// handle error
}
// node is now a *ast.ObjectNode representing the XML data
func ReleaseTree ¶
func ReleaseTree(node ast.SchemaNode)
ReleaseTree recursively releases all nodes in an AST tree back to their pools. This should be called when you're completely done with an AST (after conversion, rendering, etc.) to enable node reuse and reduce memory pressure.
Example:
node, _ := xml.Parse(`<user id="123"><name>Alice</name></user>`)
data := xml.NodeToInterface(node)
xml.ReleaseTree(node) // Release nodes back to pool
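For readers unfamiliar with node pooling, the following sketch shows the general pattern with sync.Pool. The node type, nodePool, newNode, and releaseTree are all hypothetical; the package's real AST types and pool implementation are not shown in this documentation. The key point is that a released tree must not be touched again, because its nodes may be handed out to another caller.

```go
package main

import (
	"fmt"
	"sync"
)

// node is a hypothetical tree node used only for this illustration.
type node struct {
	name     string
	children []*node
}

// nodePool hands out reusable nodes and allocates when it is empty.
var nodePool = sync.Pool{New: func() interface{} { return new(node) }}

// newNode takes a node from the pool and initializes it.
func newNode(name string) *node {
	n := nodePool.Get().(*node)
	n.name = name
	return n
}

// releaseTree resets n and all of its descendants and returns them to
// the pool. It reports how many nodes were released.
func releaseTree(n *node) int {
	if n == nil {
		return 0
	}
	released := 1
	for i, c := range n.children {
		released += releaseTree(c)
		n.children[i] = nil // drop the reference so it can be reused safely
	}
	n.name = ""
	n.children = n.children[:0] // keep capacity for the next user
	nodePool.Put(n)
	return released
}

func main() {
	root := newNode("user")
	root.children = append(root.children, newNode("name"))
	fmt.Println(releaseTree(root)) // 2
}
```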
func Render ¶
func Render(node ast.SchemaNode) ([]byte, error)
Render converts an AST node to compact XML bytes.
The node should be the result of Parse() or ParseReader(). Returns XML bytes with no unnecessary whitespace.
The XML structure uses Shape's conventions:
- Properties prefixed with "@" are attributes
- Property "#text" contains text content
- Property "#cdata" contains CDATA sections
- Other properties are child elements
Example:
node, _ := xml.Parse(`<user id="123"><name>Alice</name></user>`)
bytes, _ := xml.Render(node)
// bytes: <user id="123"><name>Alice</name></user>
func RenderIndent ¶
func RenderIndent(node ast.SchemaNode, prefix, indent string) ([]byte, error)
RenderIndent converts an AST node to pretty-printed XML bytes with indentation.
The prefix is added to the beginning of each line, and indent specifies the indentation string (typically spaces or tabs).
Common usage:
- RenderIndent(node, "", "  ") - 2-space indentation
- RenderIndent(node, "", "\t") - tab indentation
- RenderIndent(node, ">>", " ") - prefix each line with ">>"
Example:
node, _ := xml.Parse(`<user id="123"><name>Alice</name></user>`)
bytes, _ := xml.RenderIndent(node, "", "  ")
// Output:
// <user id="123">
//   <name>Alice</name>
// </user>
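How prefix and indent combine can be sketched in a few lines. lineStart is a hypothetical helper, and the scheme it shows (prefix first, then one copy of indent per nesting level) is an assumption based on the description above, not taken from the package source.

```go
package main

import (
	"fmt"
	"strings"
)

// lineStart is a hypothetical helper that builds the leading part of a
// line: the prefix, followed by depth copies of indent.
func lineStart(prefix, indent string, depth int) string {
	return prefix + strings.Repeat(indent, depth)
}

func main() {
	lines := []struct {
		depth int
		text  string
	}{
		{0, `<user id="123">`},
		{1, `<name>Alice</name>`},
		{0, `</user>`},
	}
	for _, l := range lines {
		fmt.Println(lineStart(">>", "  ", l.depth) + l.text)
	}
	// Prints:
	// >><user id="123">
	// >>  <name>Alice</name>
	// >></user>
}
```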
func Unmarshal ¶
func Unmarshal(data []byte, v interface{}) error
Unmarshal parses the XML-encoded data and stores the result in the value pointed to by v.
This function uses a fast path that bypasses AST construction (4-5x faster than the AST path). If you need the AST for advanced features, use Parse() followed by NodeToInterface().
Unmarshal uses XML struct tags to map XML elements and attributes to struct fields:
type User struct {
ID string `xml:"id,attr"` // Attribute
Name string `xml:"name"` // Child element
Bio string `xml:",chardata"` // Text content
}
To unmarshal XML into an interface value, Unmarshal stores a map[string]interface{} representation:
- "@attrname" for attributes
- "#text" for text content
- "#cdata" for CDATA sections
- "childname" for child elements
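The struct mapping can be tried out with the standard library, whose "attr" and "chardata" tag options carry the same names. The sketch below decodes into the User struct shown above using encoding/xml as a stand-in; that this package fills the fields identically is an assumption. decodeUser is a hypothetical helper.

```go
package main

import (
	stdxml "encoding/xml"
	"fmt"
)

// User mirrors the struct from the Unmarshal documentation.
type User struct {
	ID   string `xml:"id,attr"`   // attribute
	Name string `xml:"name"`      // child element
	Bio  string `xml:",chardata"` // text content of the element itself
}

// decodeUser unmarshals data with the standard library decoder.
func decodeUser(data []byte) (User, error) {
	var u User
	err := stdxml.Unmarshal(data, &u)
	return u, err
}

func main() {
	u, err := decodeUser([]byte(`<user id="123"><name>Alice</name>Likes Go.</user>`))
	if err != nil {
		fmt.Println("unmarshal error:", err)
		return
	}
	fmt.Printf("%+v\n", u) // {ID:123 Name:Alice Bio:Likes Go.}
}
```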
func Validate ¶
func Validate(input string) error
Validate checks if the given string is valid XML. It uses the fast parser for efficient validation without AST construction.
Returns nil if the input is valid XML. Returns an error with details about why the XML is invalid.
The idiomatic Go approach is to check the returned error:
if err := xml.Validate(input); err != nil {
// Invalid XML
fmt.Println("Invalid XML:", err)
}
// Valid XML - err is nil
For validating large files or streaming data, use ValidateReader instead.
func ValidateReader ¶
func ValidateReader(reader io.Reader) error
ValidateReader checks if the XML from an io.Reader is valid. It uses the fast parser for efficient validation without AST construction.
This function is designed for validating large XML files or streaming data without loading the entire content into memory.
Returns nil if the input is valid XML. Returns an error with details about why the XML is invalid.
Example validating from a file:
file, err := os.Open("data.xml")
if err != nil {
// handle error
}
defer file.Close()
if err := xml.ValidateReader(file); err != nil {
// Invalid XML
fmt.Println("Invalid XML:", err)
}
// Valid XML - err is nil
Types ¶
type Element ¶
type Element struct {
// contains filtered or unexported fields
}
Element represents an XML element with a fluent API for manipulation. All setter methods return *Element to enable method chaining.
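The chaining pattern itself is simple to sketch. The element type below is hypothetical and much smaller than the real Element; it assumes a map-backed container using the "@" and "#text" key conventions described elsewhere in this documentation. Each setter mutates the receiver and returns it, which is what allows calls to be chained.

```go
package main

import "fmt"

// element is a hypothetical, minimal fluent container. It is not the
// package's Element type.
type element struct {
	data map[string]interface{}
}

// newElement returns an empty element.
func newElement() *element {
	return &element{data: map[string]interface{}{}}
}

// Attr stores an attribute under an "@"-prefixed key and returns e.
func (e *element) Attr(name, value string) *element {
	e.data["@"+name] = value
	return e
}

// Text stores text content under the "#text" key and returns e.
func (e *element) Text(value string) *element {
	e.data["#text"] = value
	return e
}

// Child stores the child's data under its element name and returns the
// parent, so further calls keep applying to the parent.
func (e *element) Child(name string, child *element) *element {
	e.data[name] = child.data
	return e
}

func main() {
	user := newElement().
		Attr("id", "123").
		Child("name", newElement().Text("Alice"))

	fmt.Println(user.data["@id"]) // 123
	name := user.data["name"].(map[string]interface{})
	fmt.Println(name["#text"]) // Alice
}
```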
func NewElement ¶
func NewElement() *Element
NewElement creates a new Element. The element name is not stored in the Element itself but is used when rendering. This is just a data container following the XML AST convention.
func ParseElement ¶
ParseElement parses XML string into an Element with a fluent API. Returns an error if the input is not valid XML.
func (*Element) Attr ¶
Attr sets an attribute and returns the Element for chaining. Attributes are stored with "@" prefix following XML AST convention.
func (*Element) CDATA ¶
CDATA sets CDATA content and returns the Element for chaining. CDATA content is stored as "#cdata" following XML AST convention.
func (*Element) Child ¶
Child adds a child element and returns the parent Element for chaining. The name is the element name (e.g., "name", "email").
func (*Element) ChildText ¶
ChildText adds a child element with text content and returns the parent Element for chaining. This is a convenience method equivalent to Child(name, NewElement().Text(text)).
func (*Element) Children ¶
Children returns names of all child elements (excluding attributes and text/cdata).
func (*Element) GetAttr ¶
GetAttr gets an attribute value. Returns empty string and false if not found.
func (*Element) GetCDATA ¶
GetCDATA gets the CDATA content. Returns empty string and false if not found.
func (*Element) GetChild ¶
GetChild gets a child element. Returns nil and false if not found or wrong type.
func (*Element) GetText ¶
GetText gets the text content. Returns empty string and false if not found.
func (*Element) RemoveAttr ¶
RemoveAttr removes an attribute and returns the Element for chaining.
func (*Element) Text ¶
Text sets the text content and returns the Element for chaining. Text content is stored as "#text" following XML AST convention.
func (*Element) XML ¶
XML marshals the Element to an XML string with the given element name.
Example:
elem := NewElement().Attr("id", "123").Text("Alice")
xml, _ := elem.XML("user")
// Returns: <user id="123">Alice</user>
func (*Element) XMLIndent ¶
XMLIndent returns a pretty-printed XML string representation with indentation. The prefix is written at the beginning of each line, and indent specifies the indentation string.
Common usage:
- XMLIndent("user", "", "  ") - 2-space indentation
- XMLIndent("user", "", "\t") - tab indentation
Example:
elem := NewElement().
Attr("id", "123").
ChildText("name", "Alice")
pretty, _ := elem.XMLIndent("user", "", "  ")
// Output:
// <user id="123">
//   <name>Alice</name>
// </user>