XPATH and XPATH Expressions In XMLLINT


XPATH And XPATH Expressions


Earlier, I told you about xmllint and xmllint for html files.  But what if you only want to pull out the <span> tags within your html file, or only your <span lang="el"> tags?

Enter:  Xpath.

Xpath is yet another option available within xmllint (xmllint is a command line tool, and --xpath is one of its options). Remember, Xpath is used to navigate through elements and attributes in xml and html documents.  Xpath uses Xpath Expressions to select nodes or node sets within a document.

Example 1.  Looking for all of the <span> tags within an html document.

xmllint --html --xpath "//span" StedmanLesson10.html

xmllint = This tells the command line that we are going to run the xmllint program.

space = because we always put a space between the parts of a command

-- = Remember, these are the two hyphen-minus characters that tell the command line we are going to use an xmllint option.

html = This is the xmllint option we want to use because our file is an html file. 

space

-- = We are using yet another option, so we need these before the next option we are using.

xpath = Xpath is the other xmllint option we are going to use.  Why?  Because we want to write an xpath expression that tells the command line that we ONLY want ALL of the span tags (<span>) within this document.   So we must first tell the command line that we are going to write an xpath expression. 

xpath expression --->      The expression will be contained in quotation marks ("").  

// = These double slashes mean "anywhere in the document", so //span matches every span tag no matter how deeply it is nested.  We want ALL of the span tags, so we need to type this first.

span = this is the name of the tag that we want to parse

space

StedmanLesson10.html = The name of the file you want to parse goes here. 

See the picture below to see what this command looks like once executed: 


Notice the command on the first line: you see xmllint followed by the html option, then the xpath option. Then the xpath expression is given, followed by the html file name.  Everything under that line is what xmllint returned.  It returned ONLY the <span> tags (including the span tags with attributes) within the StedmanLesson10.html file.
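If you don't have my StedmanLesson10.html file, here is a minimal sketch you can try instead. It assumes xmllint is installed, and the sample.html file and its contents are made up just for this illustration. It runs the same //span expression, then uses count(), a standard Xpath function, to report how many span tags were matched.

```shell
#!/bin/sh
# Hypothetical sample file; it only stands in for StedmanLesson10.html.
sample="$(mktemp -d)/sample.html"
cat > "$sample" <<'EOF'
<!DOCTYPE html>
<html>
  <head><title>Sample</title></head>
  <body>
    <p>Hello <span>world</span>.</p>
    <p>Greek: <span lang="el">kalimera</span> and <span lang="el">efharisto</span>.</p>
  </body>
</html>
EOF

# Print every <span> element, wherever it sits in the document.
xmllint --html --xpath "//span" "$sample"

# count() returns the number of nodes that //span matched.
span_count=$(xmllint --html --xpath "count(//span)" "$sample")
echo "span tags found: $span_count"
```

How the matched tags are separated in the first command's output can differ between xmllint versions, so the count is the easier thing to check.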

Pretty cool huh?

Now, what if we want xmllint to be more specific?  What if we just want the span tags with the lang attribute that has a value of "el"?  Then we would type the command like this:

xmllint --html --xpath "//span[@lang='el']" StedmanLesson10.html


As you can see, the attribute for the span tag is typed within square brackets and begins with the @ character.  The value of that attribute, "el", is placed within SINGLE quotes, because the whole expression is already wrapped in double quotes.  Then you have your closing square bracket and closing quotation mark.
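Here is a small sketch of the attribute filter that you can run yourself. Again, it assumes xmllint is installed, and the sample.html file (including the extra lang="fr" span) is something I made up for illustration. It shows the filter leaving out spans that don't match, and that swapping the two kinds of quotes works as well.

```shell
#!/bin/sh
# Hypothetical sample file; it only stands in for StedmanLesson10.html.
sample="$(mktemp -d)/sample.html"
cat > "$sample" <<'EOF'
<!DOCTYPE html>
<html>
  <head><title>Sample</title></head>
  <body>
    <p>Hello <span>world</span> and <span lang="fr">bonjour</span>.</p>
    <p>Greek: <span lang="el">kalimera</span> and <span lang="el">efharisto</span>.</p>
  </body>
</html>
EOF

# Outer double quotes are for the shell; inner single quotes are for Xpath.
xmllint --html --xpath "//span[@lang='el']" "$sample"

# The same expression with the quotes swapped: single outside, double inside.
xmllint --html --xpath '//span[@lang="el"]' "$sample"

# Count only the spans whose lang attribute equals "el".
el_count=$(xmllint --html --xpath "count(//span[@lang='el'])" "$sample")
echo "span tags with lang=el: $el_count"
```

The plain span and the lang="fr" span are skipped, because the part in square brackets only keeps spans whose lang attribute is exactly "el".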

When I give this command, it returns the following:


As you can see, only the <span lang="el"> tags have been returned from this document.

This is just one example of how you can get really specific with xmllint for html using xpath expressions.  


 Follow me as I learn to build my website bit by bit!    IronTreeDev.com


Photo by Caleb Jones on Unsplash
