Working with XML documents in Ruby with REXML

Ruby provides a set of libraries that allow Ruby developers to work with XML documents. Before you begin working with an XML document, however, you should check that it is well-formed. Thankfully, REXML makes this easy to accomplish with a short helper method.

valid_xml? is such a method. You can use it to check XML input: it returns the parsed XML document, ready for you to use, or nil if the input cannot be parsed.

require "rexml/document"

# Returns the parsed document, or nil if the XML cannot be parsed.
def valid_xml?(xml)
  REXML::Document.new(xml)
rescue REXML::ParseException
  nil
end
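To make the return values concrete, here is a minimal usage sketch of the helper. The two sample strings are hypothetical and exist only for illustration; the second one has mismatched tags, so REXML is expected to reject it.

```ruby
require "rexml/document"

# Same helper as in the article: parsed document on success, nil on a parse error.
def valid_xml?(xml)
  REXML::Document.new(xml)
rescue REXML::ParseException
  nil
end

# Hypothetical sample inputs (not from the article).
good = "<people><person><name>Ann</name></person></people>"
bad  = "<people><person></people>" # <person> is never closed

doc = valid_xml?(good)
puts doc.root.name if doc                      # the parsed document is usable right away
puts "not well-formed" unless valid_xml?(bad)  # nil signals a parse failure
```

Because the method returns either a document or nil, the result can be used directly in a condition, as in the last two lines.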

Ruby allows you to parse an XML document into a Ruby data structure. To accomplish this, execute the following lines of code:

require "rexml/document"
my_xml = REXML::Document.new(my_xml_doc)
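As a small sketch of what the parsed structure looks like, the example below parses a hypothetical XML string (standing in for my_xml_doc, which the article does not define) and reads a few values from it.

```ruby
require "rexml/document"

# Hypothetical sample data standing in for my_xml_doc.
my_xml_doc = <<~XML
  <library name="City">
    <book id="1"><title>Dune</title></book>
    <book id="2"><title>Emma</title></book>
  </library>
XML

my_xml = REXML::Document.new(my_xml_doc)

root = my_xml.root
puts root.name                          # name of the root element
puts root.attributes["name"]            # attribute lookup by name
puts root.elements["book/title"].text   # text of the first element matching the path
```

REXML::Document.new also accepts an IO object such as an open File, so the same calls work for a document stored on disk.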

We can navigate to the root of this XML document via the root method and iterate over its child elements with the each_element method. Both of these methods belong to the Element class.

Here is an example of how these methods can be used to print the names of the innermost elements of a shallow XML document:

my_xml.root.each_element do |node1|
  node1.each_element do |node2|
    # Only descend into elements that have child elements of their own.
    if node2.has_elements?
      node2.each_element do |child|
        puts child.name
      end
    end
  end
end
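The nested loops above only work for a fixed depth. For documents of unknown depth, one option is a recursive walk built on the same each_element method. This is a sketch only: the walk helper and the sample document are hypothetical and not part of REXML.

```ruby
require "rexml/document"

# Hypothetical sample document.
xml = <<~XML
  <company>
    <department>
      <employee><name>Ann</name><role>Dev</role></employee>
      <employee><name>Bob</name><role>Ops</role></employee>
    </department>
  </company>
XML

# Hypothetical helper: collects element names, indented by nesting depth.
def walk(element, depth = 0, out = [])
  out << "#{'  ' * depth}#{element.name}"
  element.each_element { |child| walk(child, depth + 1, out) }
  out
end

doc = REXML::Document.new(xml)
lines = walk(doc.root)
puts lines
```

each_element yields only child elements, so the whitespace text nodes between tags are skipped automatically.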

Another useful way to parse a large XML document is REXML::Document.parse_stream. It does not load the entire document into memory. Instead, it reports parsing events to a listener object as it reads, so we can pick out only the parts we want to interrogate.
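A minimal sketch of stream parsing follows. It assumes we only care about the text inside name elements. The NameCollector class and the sample XML are hypothetical, while REXML::StreamListener and its tag_start, text, and tag_end callbacks are part of REXML.

```ruby
require "rexml/document"
require "rexml/streamlistener"

# Hypothetical listener that collects the text of every <name> element.
class NameCollector
  include REXML::StreamListener

  attr_reader :names

  def initialize
    @names = []
    @in_name = false
  end

  # Called when an opening tag is read.
  def tag_start(name, _attrs)
    @in_name = true if name == "name"
  end

  # Called for text content between tags.
  def text(value)
    @names << value if @in_name
  end

  # Called when a closing tag is read.
  def tag_end(name)
    @in_name = false if name == "name"
  end
end

# Hypothetical sample data; a File object could be passed instead of a String.
xml = "<people><person><name>Ann</name></person><person><name>Bob</name></person></people>"

listener = NameCollector.new
REXML::Document.parse_stream(xml, listener)
puts listener.names.inspect
```

The listener keeps only the values it needs, so memory use stays small even when the input is large.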

REXML provides developers with XPath methods to navigate XML documents. Some of the most commonly used methods are:

REXML::XPath.first(xmlDoc, '//name')
REXML::XPath.match(xmlDoc, '//*[@title="Mr"]')

REXML::XPath.each(xmlDoc, '//name') do |person|
  puts person
end
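To show what each of these calls returns, here is a short sketch run against a hypothetical document. Note that the attribute query uses //* so that the predicate applies to elements of any name.

```ruby
require "rexml/document"

# Hypothetical sample document.
xml_doc = REXML::Document.new(<<~XML)
  <people>
    <person title="Mr"><name>John</name></person>
    <person title="Ms"><name>Jane</name></person>
  </people>
XML

# first returns the first matching node.
first_name = REXML::XPath.first(xml_doc, "//name")
puts first_name.text

# match returns an array of all matching nodes.
misters = REXML::XPath.match(xml_doc, '//*[@title="Mr"]')
puts misters.size

# each yields every matching node to the block.
all_names = []
REXML::XPath.each(xml_doc, "//name") { |node| all_names << node.text }
puts all_names.inspect
```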

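In addition, there are third-party gems that parse XML documents into hashes. One example is XmlSimple, which is published as the xml-simple gem and loaded with require 'xmlsimple'.

gem 'xml-simple'
require 'xmlsimple'
require 'pp'

# xml holds an XML string or the name of an XML file
doc = XmlSimple.xml_in xml
pp doc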

XmlSimple is not a brand-new approach to parsing XML. It in fact uses REXML's Document class and then converts the parsed XML into hashes and arrays for ease of access and use.
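To illustrate the general idea of turning a parsed document into hashes and arrays, here is a rough sketch that uses only REXML. It is not XmlSimple's actual algorithm: the element_to_hash helper is hypothetical, it ignores attributes, and it always wraps children in arrays.

```ruby
require "rexml/document"

# Hypothetical helper: converts an element into nested Hashes and Arrays.
# Leaf elements become their (stripped) text; attributes are ignored.
def element_to_hash(element)
  return element.text.to_s.strip unless element.has_elements?

  element.elements.each_with_object({}) do |child, hash|
    # Group children that share a name into one array.
    (hash[child.name] ||= []) << element_to_hash(child)
  end
end

# Hypothetical sample data.
xml = "<people><person><name>Ann</name></person><person><name>Bob</name></person></people>"

doc = REXML::Document.new(xml)
result = element_to_hash(doc.root)
puts result.inspect
```

A library such as XmlSimple handles many more cases (attributes, mixed content, configurable folding), so this sketch is only meant to show the shape of the conversion.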

In conclusion, REXML offers much more when it comes to working with XML. It can help you create new XML documents, modify and delete nodes, compress whitespace, and perform replacements, among other things.
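As a brief sketch of creating, modifying, and deleting nodes, the example below builds a small document from scratch. The element and attribute names are hypothetical.

```ruby
require "rexml/document"

# Create a new document and add elements to it.
doc = REXML::Document.new
root = doc.add_element("people")

ann = root.add_element("person", "title" => "Ms")
ann.add_element("name").text = "Ann"

bob = root.add_element("person", "title" => "Mr")
bob.add_element("name").text = "Bob"

# Modify an attribute on an existing node.
ann.attributes["title"] = "Dr"

# Delete a node from its parent.
root.delete_element(bob)

output = doc.to_s
puts output
```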
