Archive for July, 2011

Long live Java!

I just realized that Java 7 is *finally* GA! A major milestone in the history of Java, given to us in the middle of summer.

This is the first release since.. uhh.. 2006? Such a relief. All the fuss around Oracle’s acquisition of Sun has caused a lot of pain for Java developers, but now it seems that all of us can finally move forward.

Congratulations and a salute to everybody who made this possible. The future looks bright :-D I’m already looking forward to Java 8 and modularization.

I have updated the Open Config documentation to recommend using Java 7… now it’s time to play around with it. Yay!

Categories: business, Java Tags:

Let there be guiding light

I have spent some time writing a foundation for the Open Config Developer Guidelines. At the moment this is a very early draft summarizing my own experience, gained through years as a software developer. Lessons learned, if you will. I do hope more people will help me on this mission ;-)

Why are developer guidelines necessary? At first it may seem like a smart-ass/dictator thing to do, trying to lecture people. This is far from the truth.

First, coding is a team effort. Developer guidelines are the pride and identity of the team, explaining their philosophy for collaboration and getting things done.

Second, the power lies in the hands of the team: developers are free to adopt their own styles and change these guidelines together, as a team. There is no such thing as “dictatorship” in this regard; tyranny and arrogance pretending to be a democracy will turn people off.

The purpose of developer guidelines is to encourage the team to maintain a certain level of consistency and quality throughout all aspects of the project. Having different ways of, and opinions on, achieving similar things causes confusion, inconsistency and, in the end, segregation within the team. This must be avoided; the team cannot afford to lose focus and good contributors to frustration and agony.

Do notice the emphasis on “guidelines” here: there will always be exceptions to every rule. Developers are expected to be thoughtful enough to care and take appropriate action (within reasonable boundaries) when the guidelines do not apply, maybe notifying the team of the anomaly or ambiguity.

Individuals should not be afraid to voice their opinions. But sometimes this causes strong philosophical disagreement within the team. When this happens, the team reaches consensus through voting. Individuals are expected to show good judgement in resolving such conflicts, providing relevant arguments that justify their point of view. If the conflict is resolved in favor of a change of direction, individuals who voted +1 for the proposal are expected to help make the change a reality.

The satisfaction of creativity is an energising motivator not to be underestimated. The team should feel proud of their creations, not frustrated by overly constraining rules.

This is a work in progress and I sincerely invite anyone to participate. Comments and feedback on the latest Open Config Developer Guidelines are more than welcome!

Categories: documentation, open-config, principles Tags:

Open Config website

The Open Config website is live and kicking on http://openconfig.deephacks.org!

Categories: Java, open-config Tags:

Look ma, Usain Bolt

Ever had performance problems? Yeah, me too. If my manager screams “faaaaster” one more time, I will have a hearing impairment for the rest of my life. BTW, did I sense a German pronunciation in all that noise? ;-)

Can you believe that there are still people doing ignorant trash talk about the garbage collector (get it?) and performance of the JVM..

“I will go back to writing C again so I don’t have to worry about performance”

*sigh*

The JVM is continuously improving its collector algorithms, and highly sophisticated optimizations are incorporated into the compiler with every release (as they have been for the last 10 years). Do *you* really expect to have the experience, ability and time to write better and more optimized C code than some of the smartest people on this earth?

Pleeeeease..

If you are like me and the other 99.99 percent of us, you would be wise to forget about C. Just get over it. (A salute to all hardcore C programmers, do not feel provoked.)

As much as we developers love abstractions, we cannot deny the fact that they are inherently leaky. Hardware *does* matter. The growth in processor count and memory makes shared-memory thread concurrency a lot harder. Locking, context switching and thread scheduling can turn your throughput into syrup, so do not assume that pouring more threads into your shiny new super-beefy machine will somehow magically give you more performance. It probably will to some degree, but that’s not my point.

So what to do? I do not claim to be a performance expert, I am not, but I have some practical advice that at least helped me squash some nasty performance bugs in the past.

  1. Write clean and “dumb” code. Consider making your classes immutable: immutable objects are thread-safe, so they need no synchronization and can be cached with confidence that their values do not change after creation. Immutability also leads to code that is easier to understand. Do not try to outsmart the JVM with premature optimization tricks.


    Donald Knuth said: “Programmers waste enormous amounts of time thinking about, or worrying about, the speed of noncritical parts of their programs, and these attempts at efficiency actually have a strong negative impact when debugging and maintenance are considered. We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil. Yet we should not pass up our opportunities in that critical 3%.”

  2. Spend some time understanding how the different garbage collectors work. The information is a bit scattered, but it’s out there. Find the resource-sharing sweet spot between garbage collection and your application. Generally speaking, larger heaps mean the garbage collector needs to work harder (stealing more CPU cycles) and pauses will be longer, but less frequent. In my experience you cannot avoid stop-the-world pauses, even using CMS, because eventually your heap will be as fragmented as Swiss cheese, and boom, memory fragmentation failure. The good news is that JDK7 will probably include a new low-pause collector, called G1, which is designed to keep stop-the-world pauses short and predictable rather than eliminate them. UPDATE: See Garbage-First Collector (G1) in Java 7.
  3. .


  4. Always use java.util.concurrent by default when programming. Read the Java Memory Model and Thread Specification. It will help you understand why your code may not be performing as it should. There are very good books on the subject of concurrency as well.


  5. Chances are that you are dealing with legacy code (that you cannot influence) with coarse-grained synchronization, causing high thread contention. Using CPU affinity with multiple JVM processes on the same machine can help reduce contention for hot locks.

  6. If you think you have found JVM performance problems by doing benchmarks, first make sure you *know* your measurements are accurate. If you try to measure something, don’t measure other stuff along with it. Ignoring this advice may lead you away from where the real problems lurk. So make sure to properly isolate the parts of the system before you start measuring.
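To make points 1 and 4 a bit more concrete, here is a minimal sketch of an immutable value class shared between threads through a java.util.concurrent map, without a single synchronized block. Money and PriceCache are hypothetical names made up for illustration; only ConcurrentHashMap and ConcurrentMap come from the JDK.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical immutable value object: all fields are final and there are no setters.
final class Money {
    private final String currency;
    private final long cents;

    Money(String currency, long cents) {
        this.currency = currency;
        this.cents = cents;
    }

    String currency() { return currency; }
    long cents() { return cents; }

    // "Changing" the amount returns a new object; the original is never modified.
    Money plus(long moreCents) {
        return new Money(currency, cents + moreCents);
    }
}

// Hypothetical cache: the concurrent map takes care of thread-safe access, and
// because Money is immutable, a cached value can be handed to any thread as-is.
final class PriceCache {
    private final ConcurrentMap<String, Money> prices = new ConcurrentHashMap<>();

    void put(String product, Money price) { prices.put(product, price); }

    Money get(String product) { return prices.get(product); }
}
```

Since a Money instance can never change, any thread reading it from the cache sees a consistent value without locking or defensive copies.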

If you suspect thread contention, for example, have a look at ThreadInfo, or try jstat and look for the sun.rt._sync_ContendedLockAttempts counter.
[java]jstat -J-Djstat.showUnsupported=true -snap PID | grep _sync_[/java]
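The ThreadInfo route can be sketched roughly as follows. This is a minimal sketch, assuming your JVM supports contention monitoring; ContentionReport is a hypothetical helper name, while ThreadMXBean and ThreadInfo are the standard java.lang.management classes. It lists how often, and for how long, each live thread has been blocked waiting for a monitor.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

// Hypothetical helper, for illustration only.
final class ContentionReport {

    // Returns one line per live thread with its blocked count and blocked time.
    static String snapshot() {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();

        // Blocked *time* is only collected once monitoring is switched on;
        // until then getBlockedTime() reports -1.
        if (threads.isThreadContentionMonitoringSupported()) {
            threads.setThreadContentionMonitoringEnabled(true);
        }

        StringBuilder out = new StringBuilder();
        for (ThreadInfo info : threads.getThreadInfo(threads.getAllThreadIds())) {
            if (info == null) {
                continue; // the thread died between the two calls
            }
            out.append(info.getThreadName())
               .append(" blockedCount=").append(info.getBlockedCount())
               .append(" blockedTimeMs=").append(info.getBlockedTime())
               .append('\n');
        }
        return out.toString();
    }
}
```

Call ContentionReport.snapshot() before and after the workload you are measuring and compare the numbers; threads whose blocked count keeps climbing are the ones fighting over hot locks.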

There is so much to say on this subject, but I don’t have time to write more right now. Happy coding!

Categories: Java, performance, principles Tags:

Docbook

Documentation is something that most developers fear/hate, and some of us usually avoid it until the end of a project. And I’m not talking about the more or less mandatory javadoc. *Ahem*

However, open-config needs a presentation sketch for readers (and potential contributors) to learn about the project, allowing them to form an immediate first impression.

I don’t want to (and can’t) document everything before even getting started. But the plan is to have a clear and concise mission statement, a brief list of planned features, the most fundamental development guidelines and principles, a high-level design and something about project governance. I will do this before I start coding, because I do not want to worry about it (too much) when I’m “in the zone”.

I recently found out about Docbook, which btw is highly praised by the open source community; I’m actually a bit surprised I did not stumble on it earlier. Docbook allows writing content decoupled from its presentation, using XML files. These XML files can then be transformed into navigable HTML pages, PDF, XHTML, man pages etc. Sweet! The syntax is a bit complex, but since it is based on regular XML I can probably use an editor with autocompletion and a nice GUI to help me out.

Docbook can also be easily integrated into Maven (well, I had to spend a few hours on it) using docbkx. This is important because documentation must be tracked by strict version control; a wiki is far too weak in this regard. Having documentation in the same GitHub repository as the code makes it easy to update, since commits can include documentation changes and the documentation thus remains up to date with the latest development. Documentation can also be branched along with code releases.

When changes are made, simply upload the latest Maven-generated HTML and PDF to the project website.

The most important project artifacts can now stay in (almost) perfect sync, no matter what medium readers wish to view them on. No more excuses for late documentation… yikes! ;-)

Categories: documentation, open-config, principles Tags:

Pimp My Tests!

July 15th, 2011

Arquillian is awesome!

Some years back I created a test framework intended to test my code in the JEE server, since I’m not a fan of testing in simulated containers. There are simply too many faults that I could not rule out once my code had shipped further down the pipeline (yeah, I don’t want ops nagging me for developing shitty code, and I also want to save them the trouble). So I really like the idea of testing my code using a full software stack almost identical to the production environment. And the turn-around needs to be really fast!

At that time I did not know of any test frameworks that were able to do in-container testing. Sure, there was Cactus (much respect), which I tried around 2004, but it didn’t fully fit my needs. So I wrote my own framework and felt quite happy with the results. But something was wrong. The feedback loop took too long and I couldn’t manage to seamlessly align the workflow between Maven and Eclipse. I had to jump between the shell and Eclipse to compile, deploy and execute my tests, and this was a minor disaster for my productivity.

And so Red Hat came out with Arquillian (last year, I think?), the missing piece for testing JEE applications… for real. It supports most of the major open source servers, decouples tests from the server, allows for a rapid feedback loop (using only Eclipse), and tests can execute remotely on a production-like software stack. It is everything I wanted my framework to be (and more) and fits my needs exactly!

I did have trouble figuring out one thing though.

My development environment needs to be unaware of where the actual machine running the tests is located. The machine could be localhost, somewhere on my network, a virtual machine on Amazon etc. I do not want to care where it is; the environment needs to figure this out by itself. Arquillian needs the address of the remote machine to be defined in arquillian.xml, which is presumably tracked by my version control system. I need something more dynamic.

However, it turns out I was wrong. Arquillian can handle this as well. I haven’t tried it out yet, but it looks promising.

Thank you for a great project!

Categories: coding, Java EE, testing Tags:

Pipe your data to /dev/null

MongoDB is web scale. A bit old, but sooo funny!! :-D

You read the latest post on HighScalability.com and think you are a f*cking Google architect and parrot slogans like Web Scale and Sharding but you have no idea what the f*ck you are talking about.

Categories: scalability Tags:

Feature branching

I couldn’t agree more with what Mike Mason and Martin Fowler think about Feature Branching. The real semantic problem of distributed teams working on the same codebase does not go away just because your version control system is really good at branching and merging. Feature branching is a poor excuse for careless development. Because… you are not really paying attention and respect to the work your teammates are doing at the same time as you. I have personal experience of this way of working, and it caused me significant psychological (and physical) injury in the past.

Indeed, feature branching is almost irrelevant in stable and modular architectures, since you have already separated your concerns, allowing teams to work without stepping on each other’s toes. Modularity also makes it easier for your interfaces to evolve in a backward-compatible way.

So hearing voices whispering “integrate, integrate, integrate” is *actually* a good thing. And if integration causes you pain, sticking your head in the sand and branching off is not what you should be doing. Chances are that you have bigger problems you should be dealing with.

Categories: principles Tags:

The Best Things in Life Are Open

Ahh! Sweet taste of freedom!

Every step taken towards a working environment for producing software in the open feels surprisingly good. Knowing that your effort is not dictated by a project budget ruled by thick hierarchies of management (sometimes ignoring the culture and principles of software developers), is a relief. This may be a manifestation of my own frustration not knowing how to work the organization.

But then again, giving something back to the community is what matters, and I will soon have the minimum technical infrastructure to get started with the task at hand:

  • blog

    WordPress has support for most features I could wish for and also a huge community around plugins and themes. I needed a neat and pretty way of posting code, so I installed a code syntax highlighter plugin. A friend of mine, Kristian Olsson, was kind enough to let me host my blog on his server. Props!

  • license

    The license has been discussed in previous posts.

  • version control

    Two alternatives, Git or Mercurial. The King Penguin convinced me to go with Git. SourceForge is close to my heart, but GitHub won in the end. In my opinion GitHub has a cleaner wiki and bug tracking. I wish for more elaborate code review support, but the Fork + Pull Model will do for now.

  • bug tracking

    I like Jira a lot, but it’s not free. Ticking off issues at commit time is essential, as are linking commits to issues, categorization and milestones. All of this is provided by GitHub. One thing I haven’t investigated is the possibility of automatically linking mailing list posts to bug comments. I really hope it is possible.

  • wiki

    GitHub has wiki support. It seems pretty basic to me, but the project does not need a sophisticated wiki at the moment.

  • mailing lists

    Sharing information, ideas and discussions is the most important goal of open-config. Every contributor to this project is allowed (and obliged) to express their opinions about open-config in the open, with decisions taken using consensus-based democracy.

    I have used Nabble in the past and liked it a lot. But I also tried out groups.google.com, which is both a forum and a mailing list, and it just *worked*, so I’ll use it for now.

    user mailing list: open-config-user@googlegroups.com
    dev mailing list: open-config-dev@googlegroups.com

I have backup and restore locked down for everything I produce from this point on, including the possibility to migrate to other providers if needed.

I also need to ensure that the work I do on this project does not conflict with the policies of my current employer. If I am in the clear, the next blog post will hopefully clarify the mission statement of open-config.

Take-off is close… :-)

Categories: open-config Tags:

LGPL vs ASL

Ok, so I did some reading on licensing and it is… actually technically interesting. My goal is to have a business-friendly license, which for me (considering my limited research) boiled down to a decision between the Apache License, Version 2.0 and the LGPL. Both of these seem to be considered friendly for commercial purposes.

What got me a bit worried is the discussion around LGPL dynamic linking in Java. Dynamic linking occurs at runtime, for example, when you have two *physically* separate class files that interact with each other (using the ‘import’ keyword) and the JVM hooks them together at runtime.
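The hooking-together the JVM normally does behind the scenes can also be triggered by hand, which makes it easier to see. This is only a sketch to illustrate the idea: LinkingDemo is a made-up name, while Class.forName is the standard call for asking the JVM to load and link a class by name at runtime. The helper reports which jar or directory the class was read from, i.e. the physical file a user could replace with another version.

```java
import java.security.CodeSource;

// Hypothetical helper, for illustration only.
final class LinkingDemo {

    // Asks the JVM to load and link the named class at runtime, then
    // reports where its bytecode came from (a jar or a class directory).
    static String origin(String className) {
        try {
            Class<?> type = Class.forName(className);
            CodeSource source = type.getProtectionDomain().getCodeSource();
            if (source == null || source.getLocation() == null) {
                // Classes shipped with the JVM itself usually have no code source.
                return "JVM runtime";
            }
            return source.getLocation().toString();
        } catch (ClassNotFoundException e) {
            // Nothing on the classpath provides this class, so the lookup fails at runtime.
            return "not found";
        }
    }
}
```

Calling LinkingDemo.origin with the name of a class from a library jar should give the location of that jar, and swapping the jar on the classpath changes what gets linked in without recompiling the caller.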

Anyway, if an LGPL library jar is distributed (zipped-n-shipped jar) and dynamically linked with a company’s application, the company must allow reverse engineering of their code (which basically is the same as shipping the source code to the user). But not only that, the company must also allow the user to change the version of the LGPL library jar. However, David Turner of the FSF states that the user is responsible for making these changes work. This *seems* fine in terms of technical support, but really, I have no idea what it means in court.

Hmmm.. There is a lot of bureaucracy and opinion on the subject that I do not have a clue about, and certainly no time to investigate..

So to avoid potential headaches, my decision falls on the more liberal approach of ASL, since it allows commercial closed-source companies to distribute their software under a license of their choice.

Categories: licensing, open-config Tags: