Archive for the ‘Software Dev’ Category
Why I like bzr over git and hg
Recently I posted on Twitter that my favorite DVCS is bzr. Manish immediately asked me why not git or hg. Here is the answer:
git
I love GitHub. I use git regularly. But compared to git, I love bzr more. Why? I find git’s SHA-based revision identifiers cryptic. The default installation of git does not support the command shortcuts (like st for status, ci for commit) that I am used to from my svn days (fact: these can be set up manually as git aliases).
hg
This choice does not have any rational backing! I did not like the odd-sounding hg at the command prompt. bzr sounded more natural to me.
Encoding to Ogg Theora Format: For use in HTML5
To encode a video file for use in a browser that supports HTML5 video (Firefox 3.5 and above), use the command:
$ ffmpeg -i in.flv -vcodec libtheora -sameq \
    -acodec libvorbis -ac 2 -sameq out.ogg
Finally use it in your page:
<video src="out.ogg"> Your browser does not support HTML5 video. </video>
Reference: HTML 5 <video> Tag.
Testing HTML
We are living in the age of Ajax and beyond, and I talk about testing HTML! But it is true: HTML needs testing. The issue with HTML is its flexibility. The HTML specification does not insist on well-formedness or correctness. It allows many different ways of doing the same thing. This paved the way for its rapid adoption when the web was invented, but later created trouble: parsers became complex, rendering became inconsistent between user agents and, of course, testing became harder.
A few days back we faced a production problem caused by HTML. The programmer had written the value attribute of an <option> element thus:
<option value=DYNAMICALLY_GENERATED>Some content</option>
This seemingly harmless code failed when the DYNAMICALLY_GENERATED content contained a space. The programmer had not wrapped the attribute value in single or double quotes, so when the form was submitted, the value sent was only the part of the string before the first space. We, the developers, had not tested with a DYNAMICALLY_GENERATED string containing spaces.
Another instance we faced was similar. The programmer had left a tag unclosed inside a <form>. Firefox rendered the page correctly and the form was functional. When it was deployed for customer review, the customers were appalled to see no form elements, just an empty screen. The customers were using IE6.
Issues like these are caused by a lack of discipline on the developer’s part. Sometimes they also creep in under pressure, or through genuine mistakes.
When we encounter problems like these, we immediately recognize that they can be solved easily. XML parsers complain at once when we make such mistakes while writing XML by hand, but the web browser is not made for this purpose. Instead, we have specialized HTML validators to perform such routine checks and find errors of this kind.
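One way to avoid this class of bug is to never write attribute values by hand, and to generate them through a small helper that always quotes and escapes. The following is only a minimal sketch of that idea in Java; the class name OptionTag and its methods are hypothetical and are not part of the code we had in production.

```java
final class OptionTag {
    private OptionTag() {}

    /** Escapes the characters that are unsafe in element text or in a double-quoted attribute value. */
    public static String escapeHtml(String raw) {
        StringBuilder out = new StringBuilder(raw.length());
        for (int i = 0; i < raw.length(); i++) {
            char c = raw.charAt(i);
            switch (c) {
                case '&': out.append("&amp;"); break;
                case '<': out.append("&lt;"); break;
                case '>': out.append("&gt;"); break;
                case '"': out.append("&quot;"); break;
                default:  out.append(c);
            }
        }
        return out.toString();
    }

    /** Builds an <option> element whose value attribute is always wrapped in double quotes. */
    public static String option(String value, String label) {
        return "<option value=\"" + escapeHtml(value) + "\">" + escapeHtml(label) + "</option>";
    }

    public static void main(String[] args) {
        // A value with a space survives intact because the attribute is quoted.
        System.out.println(option("New Delhi", "Some content"));
        // prints: <option value="New Delhi">Some content</option>
    }
}
```

With the quotes in place, the browser submits the whole string "New Delhi" rather than only the part before the space.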
HTML Validators
Some of the popular HTML Validators:
- W3C’s validator
- WDG’s validator
- Validator by Henri Sivonen
- HTMLTidy
- Firefox Addon (not available for Linux)
Note that some of the services like WDG validator and Henri Sivonen’s validator also come with source code. You may run these services in your local network too.
Challenges in testing HTML
Dynamic content
The first challenge is the process of HTML generation itself. The steps in generating the final HTML:
code > compile > deploy > runtime inputs > final HTML generation
As you can see, in most web development methodologies the final HTML is generated only in the last step. This makes validation costly in terms of time.
Automation (with examples of Java specific tools)
As with many things in the programming world, we can automate the validation of HTML. This can be done during integration/functional testing (because the actual HTML is generated in the runtime environment based on various dynamic user inputs and parameters).
Approach 1: Using a validation service
The first approach is to use your functional test tool to get the HTML source, and then programmatically send it to one of the validation services (validator.nu exposes its functionality as a web service). For example, when using Selenium in a JUnit test, you would get the HTML source thus:
// Fetch the HTML source of the current page (getBodyText() would return only its visible text)
String htmlSource = selenium.getHtmlSource();
// Send htmlSource to the validation service to verify its validity
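Sending the source to a validation service could look roughly like the sketch below. It uses the java.net.http client from Java 11. The class name ValidationRequest is hypothetical, and the endpoint URL and its out=json parameter are assumptions on my part that should be checked against the documentation of the service you pick.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;

final class ValidationRequest {
    // Assumed endpoint: verify it against the validation service's own documentation.
    static final String ENDPOINT = "https://validator.nu/?out=json";

    private ValidationRequest() {}

    /** Builds a POST request that carries the page source as its body. */
    public static HttpRequest build(String htmlSource) {
        return HttpRequest.newBuilder(URI.create(ENDPOINT))
                .header("Content-Type", "text/html; charset=utf-8")
                .POST(HttpRequest.BodyPublishers.ofString(htmlSource, StandardCharsets.UTF_8))
                .build();
    }

    /** Sends the request and returns the raw response body for the test to inspect. */
    public static String send(String htmlSource) throws IOException, InterruptedException {
        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(build(htmlSource), HttpResponse.BodyHandlers.ofString());
        return response.body();
    }
}
```

A test would call ValidationRequest.send(htmlSource) and fail when the response reports errors. How the errors are reported depends on the service, so that part is left out here.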
Approach 2: Using JTidy inside your test suites
JTidy is a port of the immensely popular HTMLTidy tool initially written by Dave Raggett, now maintained by volunteers.
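import org.w3c.tidy.Tidy;
Tidy tidy = new Tidy();
tidy.parse(inputStream, System.out); // writes the cleaned-up markup to System.out
// tidy.getParseErrors() then reports the number of errors found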
Approach 3: Use a tool that supports HTML verification
If you look at the architecture of MaxQ, JTidy is integrated into the tool’s default testing workflow.
Based on the approaches discussed, choose the approach best suited to you and your environment.
Most despicable web browser: IE
I know I am not the first person to say this, and I will not be the last. But I am using this post to vent my indignation about a pathetic piece of engineering called IE.
Recently I have been working on portlet development in Liferay. We developed a fancy portlet with Ajax and stuff showing content from an Alfresco deployment. While styling the portlet, I went through a nightmare. The HTML, CSS and JavaScript we developed worked in all browsers except IE 7. And the causes are so pathetic: nested divs, nested styling components, menu generation using lists.
I am wondering what kind of coding and engineering effort would have gone into making this crap of a browser called IE. As I am imagining how to write code for parsing and rendering CSS display components, I can feel the awkward code that is inside IE parsing my CSS. Something like graceful degradation and granularity of CSS styles are simple to achieve by “a little thought” while coding. But the crap inside IE makes it so complicated!
I wish there never was a browser called IE.
Writing HTML
Programmers are often required to write HTML code. Recently, on reviewing such code, I found some glaring mistakes. Based on this experience, I have assembled some points which programmers should note when developing HTML.
Version of HTML
Before writing HTML, decide upon version compliance. HTML 4.01 and XHTML 1.0 are popular choices.
Specify Version of HTML as DOCTYPE
For HTML 4.01 it is:
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
For XHTML 1.0:
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
For a detailed list, visit: http://webdesign.about.com/od/xhtml/a/aa011507.htm
Don’t use deprecated elements like <font>
<font> has been deprecated since version 4.01 of HTML.
Use CSS for styling
For styling purposes (specifying font, color, background-color, border, etc.) use only CSS. For example, for setting the background of a page, the earlier method is:
<body bgcolor="blue">
This is better written as:
<body style="background: blue;">
Open/Close Elements
Please ensure you open and close the HTML elements in proper order. Always have the discipline of closing open elements.
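To illustrate what "proper order" means, here is a rough Java sketch that walks through the tags with a stack and reports the first one that is unclosed or closed out of order. It is not a validator: the class name TagBalance is hypothetical, the regular expression ignores comments and scripts, the list of void elements is deliberately short, and it is stricter than HTML itself, which permits omitting some end tags.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class TagBalance {
    // Matches an opening, closing or self-closed tag and captures its name.
    private static final Pattern TAG = Pattern.compile("<(/?)([a-zA-Z][a-zA-Z0-9]*)[^>]*?(/?)>");
    // Elements that never take a closing tag (short, illustrative list).
    private static final Set<String> VOID = Set.of("br", "hr", "img", "input", "link", "meta");

    private TagBalance() {}

    /** Returns null when every tag is closed in order, otherwise a description of the first problem. */
    public static String firstProblem(String html) {
        Deque<String> open = new ArrayDeque<>();
        Matcher m = TAG.matcher(html);
        while (m.find()) {
            String name = m.group(2).toLowerCase();
            boolean closing = !m.group(1).isEmpty();
            boolean selfClosed = !m.group(3).isEmpty();
            if (VOID.contains(name) || selfClosed) {
                continue; // nothing to balance
            }
            if (!closing) {
                open.push(name);
                continue;
            }
            if (open.isEmpty()) {
                return "unexpected </" + name + ">";
            }
            String expected = open.pop();
            if (!expected.equals(name)) {
                return "expected </" + expected + "> but found </" + name + ">";
            }
        }
        return open.isEmpty() ? null : "unclosed <" + open.peek() + ">";
    }

    public static void main(String[] args) {
        System.out.println(firstProblem("<form><div><input type=\"text\"></form>"));
        // prints: expected </div> but found </form>
    }
}
```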
Indentation
HTML is also source code which is maintained by humans. Please respect yourself and the people who will be maintaining it later: write readable HTML with proper indentation.
Validate HTML
Use a proper validation service before publishing your HTML. You may also use tools like xmllint to validate your HTML.
Test in target browser
All our development systems run Linux. We developers test our HTML in Firefox. But our clients use IE. Situations like these demand additional testing effort in IE.
Forcing HTTP Download
To force an HTTP download of dynamically generated content, I usually set the HTTP header Content-Type to application/octet-stream. This forces the browser to display the Save dialog box, but it means sending the wrong content type even when we know the correct one. Recently I discovered another HTTP header which solves this problem: Content-Disposition. It can take the following two values:
- inline: This will render the content inline in the browser.
- attachment: This will force the browser to display the Save dialog.
When generating dynamic content, it is also recommended to specify a proper filename. The filename can be specified as a parameter of the Content-Disposition header. An example:
Content-Disposition: attachment;filename=document.pdf
Content-Disposition is covered in RFC 2183.
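When the filename itself is dynamic, the header value is best built by a helper that quotes it. The Java sketch below is one possible way to do that. The class name ContentDisposition is hypothetical, and the sketch assumes plain ASCII filenames; names with other characters need more care than is shown here.

```java
final class ContentDisposition {
    private ContentDisposition() {}

    /** Builds an "attachment" header value with the filename as a quoted string. */
    public static String attachment(String filename) {
        StringBuilder quoted = new StringBuilder(filename.length() + 2);
        for (int i = 0; i < filename.length(); i++) {
            char c = filename.charAt(i);
            if (c < 0x20 || c == 0x7f) {
                continue; // drop control characters such as CR and LF so the header cannot be split
            }
            if (c == '"' || c == '\\') {
                quoted.append('\\'); // escape characters that would end or corrupt the quoted string
            }
            quoted.append(c);
        }
        return "attachment; filename=\"" + quoted + "\"";
    }

    public static void main(String[] args) {
        System.out.println("Content-Disposition: " + attachment("monthly report.pdf"));
        // prints: Content-Disposition: attachment; filename="monthly report.pdf"
    }
}
```

In a servlet, the returned value would be passed to response.setHeader("Content-Disposition", ...), with Content-Type still set to the real type of the content.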
On Pairing: By John De Goes
John De Goes, President of N-Brain, has this to say about pairing:
… I discovered that two heads really are better than one. Developers have different ways of interpreting code, and different ways of solving problems. Combining this rich diversity creates a strength unequaled by any single developer. Put me in a room with a junior programmer, and turn us loose on some task, and I guarantee you that I will gain new insight into the problem from this developer, and that our resulting solution will be stronger than anything I could have come up with alone.
Read full interview here: http://java.dzone.com/news/interview-john-de-goes-free-un.