Tagged languages like HTML or XML are different from conventional programming languages in that the punctuation (tags) are either very numerous (as in HTML) or a user-definable set (as in XML). Further, tags can often have parameters. Suggest how to divide the following HTML document:

Here is a photo of <B>my house</B>:
<P><IMG SRC = "house.gif"><BR>
See <A HREF = "morePix.html">More Pictures</A> if you
liked that one. <P> 

into appropriate lexemes. Which lexemes should get associated lexical values, and what should those values be?
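
One possible division, offered only as a sketch and not as the intended answer, treats each tag name, attribute name, quoted attribute value, and run of plain text as a lexeme. Under that assumption, tags carry their name as the lexical value, quoted strings carry their contents, text runs carry the text itself, and the `=` sign needs no value. The token kinds (`START_TAG`, `END_TAG`, `ATTR_NAME`, `EQ`, `ATTR_VALUE`, `TEXT`) and the `tokenize` function are hypothetical names chosen for illustration; the regular expressions handle only the simple tag forms that appear in the sample document, not full HTML.

```python
import re
from typing import Iterator, NamedTuple, Optional


class Token(NamedTuple):
    kind: str             # hypothetical token name, e.g. "START_TAG"
    value: Optional[str]  # lexical value, or None when the token needs none


# Assumption: a tag is "<", an optional "/", a name, then anything up to ">".
TAG = re.compile(r'<\s*(/?)\s*([A-Za-z][A-Za-z0-9]*)([^>]*)>')
# Assumption: attributes always have the form NAME = "value".
ATTR = re.compile(r'([A-Za-z][A-Za-z0-9]*)\s*=\s*"([^"]*)"')


def text_token(text: str) -> Iterator[Token]:
    """Yield one TEXT token for a non-blank run, with whitespace collapsed."""
    if text.strip():
        yield Token("TEXT", " ".join(text.split()))


def tokenize(html: str) -> Iterator[Token]:
    pos = 0
    for m in TAG.finditer(html):
        # Plain text between the previous tag and this one.
        yield from text_token(html[pos:m.start()])
        closing, name, rest = m.group(1), m.group(2).upper(), m.group(3)
        if closing:
            yield Token("END_TAG", name)
        else:
            yield Token("START_TAG", name)
            for a in ATTR.finditer(rest):
                yield Token("ATTR_NAME", a.group(1).upper())
                yield Token("EQ", None)              # no lexical value needed
                yield Token("ATTR_VALUE", a.group(2))
        pos = m.end()
    # Any text after the last tag.
    yield from text_token(html[pos:])
```

With this sketch, `<A HREF = "morePix.html">` yields `START_TAG` with value `A`, `ATTR_NAME` with value `HREF`, `EQ` with no value, and `ATTR_VALUE` with value `morePix.html`. An equally defensible design makes each distinct tag its own token with no value, which suits HTML's fixed tag set, whereas a generic tag token with the name as its value suits XML's user-definable tags.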

