All-powerful Java code snippet

26 02 2009

As a joke (but not really), I was terribly annoyed at Java today, and wrote this Java code in a few minutes.  It’s (not surprisingly) capable of accomplishing *anything*.  Give it a shot.

Project project = new Project();
ProjectAttempt attempt;
try {
    attempt = project.codeWith(Java);
    if (attempt.results == tooRigid ||
        attempt.developmentTime > acceptableThreshold) {
        // since the above condition is usually true,
        // we should expect this next line to happen frequently

        throw new ExceptionallyNotGoodEnoughException();
    }
} catch (ExceptionallyNotGoodEnoughException e) {
    System.out.println(" ERROR: Java is too rigid!");
    System.out.println("1 a: deficient in or devoid of " +
        "flexibility <rigid price controls> <a rigid bar of metal> ");
    System.out.println(" b: appearing stiff and unyielding <his face rigid with pain>");
    // Don't bother with another try/catch block,
    // since it's impossible for the next line to fail.

    attempt = project.codeWith(Python);
    // Should always read "Impressive"
    System.out.println("Development time: " + attempt.developmentTime);
}

Smart-alec replies are welcome, but you won’t change my opinion on the matter 🙂

Modern Programming: Leaving Java – Part 2

16 02 2009

So, part 1 was all about our current circumstance in the world, with Java being a bit “dusty” (to use the term I used in the part 1 post) when held in comparison to the ideas backing new languages.

To recap the most important idea from part 1:  Java was a great step forward for its time, but there is of course the possibility of a better idea than Java.

Was the original brick cell phone the end-all solution for cell phones?  Heck no.  Was Compaq’s first laptop the best design, just because it was the first of its kind?  Absolutely not.

Just so, C and Java are not necessarily the best end-all solution, no matter how good you feel either of those languages is.

As I pointed out in part 1, there is a good legitimate use for (almost) every language, regardless of anything I say against any language.  Thus, Java will never *actually* disappear, just like C will never actually disappear.  I’m okay with that, but I wish to bring a better understanding to those of you who code in Java who would like to crucify people like me for suggesting that Java isn’t the best solution.

Part 2: Something of a Design Flaw

We all hate that diabolical moment when the sudden realization hits us: there’s a design flaw, and a project redesign is needed in order to accomplish the necessary task.  It’s either a redesign, or you’ll be forced into countless more hours of trying to code your way around the design flaw.  The code becomes convoluted, making it impossible for the next guy to easily grasp what’s going on.

Part 1 was clean of any significant reference to other languages.  Part 2 is not.  You’ve been warned.  I’ve established the foundation of my case in Part 1, and so from this point of view, read the rest of Part 2  in conceptual terms.

If I get any replies on how to better code a Java example, then those persons have entirely and utterly missed the point of what is being presented (again).  If you are one of those persons, then I cry for you and your blindness to outside ideas.

Java’s core development effectively stopped the day the JVM was completed;  “That’s it, that’s all” (as the French Québécois say.. in English).  Once the JVM was made, the Java platform allowed for pretty much any package to be developed.  The JVM has all of its nasty internals tucked out of sight, and coders are free to make whatever code they want, and it’ll run on top of the JVM.  Yes, the JVM is probably tweaked frequently, but huge additions aren’t really in Java’s script (‘script’ as in playwright).  Therefore, you have to “code around” the design in order to accomplish modern operations on Java’s data types.

This is generally the pattern of most languages: initial core development exits stage right, package development takes the spotlight.  So this being the case, we can’t really expect any overhaul to the JVM and how it works.  Therefore, I present to you concepts.

Programming is becoming more and more “object-oriented”, and rightfully so.  The goal therefore, is to make the code modular.  The goal of modular code is easy deployment/use all over the project’s code.  You change something in one main source file, and the changes are reflected everywhere.

Java (like a few other languages) has saturated (super-saturated?) itself with packages which define hundreds of specialized data structures for certain tasks.  This is great if you understand the fine nuances between their uses.  But frankly, naming two classes “HashMap” and “Hashtable” doesn’t clear anything up for me.  What’s the difference?  (Readers, don’t answer; it’s rhetorical.)

Python’s idea is a bit different: of course it provides certain data structures for you to use, but it decided to circumvent some of this madness.

So here’s the bottom line:  Java says “Pick your card”.  Python says “Write your own card”.

That’s probably a terrifying thought for you Java people.  “What?  No Hash generalization to plug into a [Concurrent]Map/Table/AttributeSet/Set ???!!?” And the same applies to Tree structures and the whole gang.

No, Python’s job is not to baffle you with finely-tuned data structures that you’ll never master.  Instead, Python’s “Write your own card” approach is used, and it’s surprisingly simple:  You’re going to write what you need to use, but we’re going to make it easy to do that.

Let’s start with something *extremely* simple.  My argument has more weight than this, but you’ll begin to understand soon.  Java did away with casting simple integer values into booleans for use in IF constructs.  The reasoning probably had to do with ambiguity between a ‘1’ and a ‘true’.  To avoid this, Java doesn’t cast ints into booleans.  Fair enough.  But then we explore the rest of how Java compares values.  To test the “value” of a variable, we’re used to using the ‘==’ sign.  The problem with objects is that every object variable is a reference (a pointer).  This means that if you ask if Object1 == Object2, you’re comparing the memory addresses that the two references point toward.  That’s not what you want.  So by necessity, Java’s base class Object defines a method called ‘equals’, which each class overrides to say what “equal” means for it.  And so Java typically tests value by the ‘.equals()’ method.  This sort of removes the ‘==’ from its most common use!  ‘==’ is left only for loop conditions and simple primitive tests.

Python, and a few other more forward-thinking languages, have added a reserved word called ‘is’.  This solves the issue noted above.  When you want to compare the ‘value’ of two objects, you use ‘==’, but when you want to test the ‘identity’ of two objects, you use the word ‘is’.

The point here is that if you say Object1==Object2, you’re comparing values (the most common case, you might argue), but when you want to ask if Object1 literally IS Object2 (ie, their pointers are the same), Python just makes you compare as Object1 is Object2.

Example follows. Note that in Python, examples are provided with “>>>” prefixing lines that you would type in an interactive console session. Lines without the “>>>” are things returned to you by the Python Interpreter.
# Python
>>> myInt = 1
>>> myBoolean = True
>>> print myInt == myBoolean
True
>>> print myInt is True
False

Very simply, the ‘==’ operator is used to ask the integer and boolean values if they “boil down” to the same thing. This is True. But if you ask the integer if it IS the boolean value True, then it returns False, because an integer simply ‘is’ not a boolean.

This is the kind of new stuff that Python (and other languages) are offering to the world. Java is sort of half stuck in this world of mixing the meaning of the ‘==’. If used with primitives, you test by value. If you test with any object at all, you test by memory address, and are thus forced to use Object1.equals(Object2) to test value.
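To make the value/identity split concrete, here is a small sketch. It is written in Python 3 syntax (print as a function), unlike the Python 2 console sessions in this post, and the variable names are made up for illustration.

```python
# Two separate lists with the same contents, plus a second name for the first one.
first = ["a", "b"]
second = ["a", "b"]   # a different object that happens to hold equal values
alias = first         # no copy is made; both names point at one object

same_value = first == second       # compares contents
same_object = first is second      # compares identity
alias_is_first = alias is first    # the same object under two names

print(same_value, same_object, alias_is_first)
```

In Java terms, `==` here plays the role of `.equals()`, and `is` plays the role of Java's `==` on object references.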

Integrating custom objects in Python into standard Python syntax is amazingly easy.  By contrast, Java rather frequently forces you into writing goofy class methods to perform your tests.

Consider this example of array indexing.  The following seems natural:
// Java
String[] myArray = new String[]{"1", "2", "3"};
String myVar = myArray[0];

But if you write your own object, you can’t ever use indexing notation to get or set items at the indexed position.  You have to use custom class methods instead.  But in Python, you can integrate with the language by defining methods on your object.  Thus, I can actually tell Python what myCustomObject[0] or even myCustomObject["sectionName"] means:
# Python
>>> myCustomObject["mainSection"] = "hello world"
>>> print myCustomObject["mainSection"]
hello world

In effect, you can have your custom object manage the values however you want; that is up to you.  It abstracts the implementation so that if it ever changes, code other people have written won’t deprecate so easily, because they access the data via standard Python syntax.
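For the curious, the "defining methods on your object" part might look like the sketch below. The class name SectionStore and the dictionary it hides are my own assumptions for illustration; only the two special methods, `__setitem__` and `__getitem__`, are what Python actually looks for. It uses Python 3 syntax.

```python
class SectionStore:
    """Hypothetical container; how it stores values is an implementation detail."""

    def __init__(self):
        self._sections = {}  # assumption: a plain dict is enough for this sketch

    def __setitem__(self, key, value):
        # Python calls this for: store[key] = value
        self._sections[key] = value

    def __getitem__(self, key):
        # Python calls this for: store[key]
        return self._sections[key]


my_custom_object = SectionStore()
my_custom_object["mainSection"] = "hello world"
print(my_custom_object["mainSection"])
```

If the dict were later swapped for a database or a file, code using the square brackets would not have to change.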



Further, you can actually iterate over a custom object in your own custom way.  You can pick what “iterating over” the object means.  You have complete power. Assuming you had several “positions” in your custom object, example code might look like the following:
# Python
>>> for i in myCustomObject:
...     print i, ":", myCustomObject[i]
mainSection : hello world
another : foobars are healthy

Can you see the power in this?  You have the amazing ability to tell the language how to integrate very simply with the standard language syntax.
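Here is one way the iteration hook could be written, as a sketch in Python 3 syntax. The class name and the choice to hand back section names in sorted order are assumptions of mine; the only part Python requires is the `__iter__` method.

```python
class IterableSections:
    """Hypothetical container that decides for itself what iterating over it means."""

    def __init__(self):
        self._sections = {}

    def __setitem__(self, key, value):
        self._sections[key] = value

    def __getitem__(self, key):
        return self._sections[key]

    def __iter__(self):
        # Called by "for name in obj": here we choose to yield section names, sorted.
        return iter(sorted(self._sections))


sections = IterableSections()
sections["mainSection"] = "hello world"
sections["another"] = "foobars are healthy"

for name in sections:
    print(name, ":", sections[name])
```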



Lots of “advanced” object functionality is available to Python’s objects by default.  There’s no need to dig out obscure utility functions to get efficient functionality.  A popular thing about arrays these days is “array slices”.  Simply, this means that you can access several indexes of an array at a time. In Python, your typical idea of an ‘array’ is implemented as a data type called a “list”. This is built into Python automatically. There’s no need to import any goofy package first.
# Python
>>> myList = ["a", "b", "c", "d"]
>>> myList[0]
'a'
>>> myList[0:3]
['a', 'b', 'c']
# You can also imply the beginning or end of the list:
>>> myList[:3]
['a', 'b', 'c']
>>> myList[1:]
['b', 'c', 'd']
# Or, you can go backwards:
>>> myList[-1]
'd'

See how easy this is?  Say goodbye to excessive myArray[myArray.length - 1] calls.  Lists support pushing, popping, removing specific items, and queries to find out if an element exists in the list:
# Python
>>> myList = ["a", "b", "c", "d"]
>>> myList.append('e')
>>> myList
['a', 'b', 'c', 'd', 'e']
>>> poppedValue = myList.pop()
>>> poppedValue
'e'
# Pop from an index:
>>> poppedValue = myList.pop(0)
>>> poppedValue
'a'
# Append lists together:
>>> myList.extend(['1', '2', '3'])
>>> myList
['b', 'c', 'd', '1', '2', '3']
>>> "b" in myList
True
>>> "z" not in myList
True
# As a side note, you can do the same to any sequence-like object, including strings or custom objects!
>>> myString = "12345"
>>> myString[2:4]
'34'
>>> myCustomObject[1:]
['foobars are healthy']

Much more is possible, but this is good enough to start with.  Just understand how amazingly simple Python has made this.  Java is a nightmare compared to this.  It might all seem trivial to you, but Python has decided what the most effective way to perform these operations is, and simply built it in.  Anybody caught using any loops in Python to do these things should be arrested.  Anybody caught using loops in Java to do these things is doing it because they have no choice.
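The myCustomObject[1:] line in the session above works because a slice arrives at `__getitem__` as a slice object. Here is a minimal sketch in Python 3 syntax; the Playlist class and its contents are hypothetical, and it simply forwards the index or slice to an internal list.

```python
class Playlist:
    """Hypothetical sequence-like object backed by a plain list."""

    def __init__(self, items):
        self._items = list(items)

    def __getitem__(self, index):
        # index is an int for obj[0] and a slice object for obj[1:];
        # the built-in list already understands both, so we just pass it along.
        return self._items[index]


tracks = Playlist(["intro", "verse", "chorus", "outro"])
tail = tracks[1:]              # slicing
last = tracks[-1]              # negative indexing
has_verse = "verse" in tracks  # membership falls back to indexing from 0 upward

print(tail, last, has_verse)
```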



These examples only touch the surface of the differences.  These differences are exactly what makes modern programming languages so powerful.  Yes, there’s a speed trade off, but the productivity to be gained is incredible.  The “scaffolding” of Java and C is overdone.  It is not as needed as it once was.  Python’s clean appearance is easy to read, easy to maintain, and easy to extend.  Python is fully object oriented, and supports multiple class inheritance.  Complex tasks are easier, and you’ll find that you don’t need to code with your browser open to the current JavaDocs.  Python’s just easier.  I’d love to expound on more features of Python which further enhance productivity, so I’ll probably write a Part 3, for those curious to have it explained.

The point is not comparing speeds.  The charts often prove that Java can outrun Python in many tasks.  But as you look at the code to power the Java part of the test, and then compare it to the Python equivalent, you’ll be shocked at how little effort is required to achieve the same behavior.  And those clock cycles lost in the Python code are more than made up in productivity, allowing you to simply accomplish MORE.  Certain tasks will still require C, regardless of how good Java is.  Nobody seems to complain about the slowness of Java compared to C.  Thus, it’s not a big step to move on to Python.  Modernize.  Get more done.

Until part 3.

Why Java is stupid (and should be deprecated, haha..)

12 02 2009

If Java and a language like Python got in a fight, Python would eat the eyeballs of Java, and then spit them out at jobless teenie-bopper Java coders.

Blinded by the magical cross-platform abilities Java presents to its audience, Java coders are hard at work being hard at work.  Meanwhile, back at the ranch, Python coders are producing things that perform actual tasks.

(Keep in mind that in the first few comments below, people have bashed on the idea that I’m comparing APIs.  But in fact, I’m stating that Java is conceptually flawed and thus its API is inferior, because it doesn’t have a JAVA api.  It only has its PACKAGE apis, which only invoke more inefficient Java byte codes, which STILL must be put through the JVM.  If Java itself had an API for internal tasks, this post would have no relevance.)

A comparison is in order:

The Java philosophy states rather boldly that “less is more”, in the sense that programming shorthand creates trouble for programmers: they can’t say for sure what is happening behind the scenes, and that creates confusion.

The Python philosophy is radically the opposite: Shorthand improves readability (so long as the shorthand is intelligently founded on useful concepts) and thus boosts productivity.  This allows you to simply get mundane tasks out of the way, so that you can quickly accomplish your task.  Others who look at the code can easily understand what the shorthand does, so long as they have an interest in understanding the language.

A simple universal illustration should easily demonstrate the concept.  Here is an agreeably poorly coded loop, followed by the revolutionary shorthand:

int ctr = 0;
int anotherVar = 100;
while (true) {
   // doingStuff
   if (ctr == anotherVar)
      break;
   ctr++;
}

for (int i = 0; i < max; i++) {
   // doingStuff
}

The reason why this is poorly coded is merely derived from the fact that you can’t tell what is happening by glancing at it.  This is a fact.  The author of the code may know what it does, but the rest of us need to read it. This is why the for loop came into existence.

Nobody will take the stance that the for loop intrudes on namespace and creates confusion.  It helps make code easier to read.  If you don’t agree, you need emotional help.

Now, by sharp contrast, consider this bit of Java code, which demonstrates how to count how many times a substring appears in a String object.  I’ve written it out in three (very) different ways, none of which is a productive use of my time to write:
// 1
int count = 0;
for (int fromIndex = 0; fromIndex > -1; count++)
   fromIndex = text.indexOf(search, fromIndex + ((count>0) ? 1: 0));
count -= 1;

// 2
import java.util.regex.*;
Matcher m = Pattern.compile(search).matcher(text);
int count;
for (count = 0; m.find(); count++);

// 3 (split() treats 'search' as a regex; the -1 limit keeps trailing empty pieces)
int count = text.split(search, -1).length - 1;

#1 is pretty ghetto, because it has a for statement which mixes meanings, and doesn’t really illustrate the traditional use of the for statement.  It’s effective, but burdened by the fact that Java won’t cast boolean values into integers (hence the ternary operator in there).

#2 is more concise, but requires the regex package, just to count flipping substrings.  The very non-traditional for statement is easier to read than that in #1, but still kind of goofy.

The only method that comes CLOSE to being efficient is #3.  As a bonus, #3 can use regex without having to import the blasted regex package.  But the fact still remains that it doesn’t scream at you that it’s counting a substring.  I haven’t benchmarked, so I don’t know how efficient it is with resources.  The split method on the String class is specific to the String class itself, and thus can’t be used with other String variants, like StringBuffer or StringBuilder.

Now let’s consider some Python code.  But first, if you didn’t know the Python language, what would you guess the name of the function is?  An intuitive guess would include the possibility of it having the word “count” in it somewhere.
# Python
count = text.count(search)

What?  We’re already done?  Example concluded.  Nobody can argue that this is hard to understand, or that it pollutes namespace, since the “count” method is one that is called on the string object itself.  Naming a variable “count” (as I in fact did in the example) does not create any confusion or ambiguity for the rest of the code.


If you want to multiply a String in Java, you can’t do it effectively.  You must make a for loop to iterate X number of times, appending the source string to a new StringBuilder until you’re done.  What would possess me to do that?  That oozes with inefficiency, and begs a newbie to do it poorly.

Python (and many other interpreted languages) has simply overloaded the multiplication operator, so that "x" * 5 simply equals "xxxxx".  This isn’t a terribly frequented piece of code, but the mere fact that Java can’t do it as effectively as most other languages suggests something.  For a Django admin application I have written, the web page renders a tree-structure in a table form.  In other words, table rows are indented five spaces per nested level, to give the traditional file-tree effect.  If I had written this in Java, my best mustered attempt would be an inefficient static utility method to do the same.  Heaven knows I don’t want to subclass every String in my project, just so that I can multiply my Strings!

No, the solution lies in simple multiplication overloading.  Indenting my table rows is as simple as five spaces multiplied by the nested level:
indent = "     " * nested_level() 

Good. Clean. Code.
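To show the indent trick doing its file-tree job, here is a small sketch in Python 3 syntax. The function name render_rows and the (level, label) row format are made up for this example; they are not taken from my Django application.

```python
def render_rows(rows, spaces_per_level=5):
    """Indent each label by spaces_per_level spaces per nesting level.

    rows is a list of (nesting_level, label) pairs; this format is an assumption.
    """
    return [" " * spaces_per_level * level + label for level, label in rows]


lines = render_rows([(0, "root"), (1, "child"), (2, "grandchild")])
for line in lines:
    print(line)
```

Multiplying a string by zero gives the empty string, so top-level rows need no special case.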

So then, how would a novice go about creating something like the PHP ‘implode’ function, which takes an array of strings, and turns it into one big string with a specific delimiter between each array position?  (Quick example:)

{"1", "2", "3"} imploded with ", "    ==    "1, 2, 3"

Note that the 3 is not followed by another ", ".  That is the purpose of the function, and it is amazingly easy to understand. Observe the primitive method that Java must resort to, while Python laughs at the sideline:

// Java
import java.lang.StringBuilder;
StringBuilder result = new StringBuilder();
String[] myArray = {"1", "2", "3"};
for (int i = 0; i < myArray.length; i++) {
   // put the delimiter before every element except the first
   if (i > 0) result.append(", ");
   result.append(myArray[i]);
}
String implodedString = result.toString();

# Python
implodedString = ", ".join(["1", "2", "3"])

Java needs to go to its room.

There is nothing effective about forcing (7 + 4*myArray.length) java commands into the Java Virtual Machine.  That’s each static statement, plus the two lines inside the for statement per iteration, the ‘i++’ part of the for statement each iteration, plus the condition evaluation each iteration.  This is horrendous.

Python understands that things are simply easier and faster if it stays on the internal side of the code; if Python can keep its tasks inside of its C-coded backend, things are going to be much faster, and easier to read.  If Java could do the same, it would be a better language.  But the fact is that it *can’t* do that, because the code you are writing is just as deep as the set of packages you import.  Everything has to pass through the Java Virtual Machine.  The Java Virtual Machine has no functions/methods of its own in order to speed things up (in terms of clocked speed and productivity).

Java’s motto is effectively:  Java Can’t.  And it’s that simple.  It’s founded on a design flaw, which bases the language off of pseudo-low level code.  This is a silly notion, since Java is interpreted.  If there is an effective way of doing things, and its opposite, ineffective way, Java leaves you in the middle of the ocean to try your luck.  And if you show your code to someone else, they’ll inevitably have a better, faster, harder to read solution.

On the other hand, Python provides you with the effective solution.  You don’t have to know the exact algorithm to Python’s “join” function (but if you wanted to know how it does it, you could open up the source and look).  Python has witnessed Java’s blunder, and has improved upon the design.  Python’s equivalent to Java’s Virtual Machine is actually intelligent, by comparison.  The Python interpreter is code written in C.  The C code is where the power of the built-in functions lies.  Python can then execute your scripts at far better efficiency, and look good doing it,

and Java Can’t.

Understanding Python’s ‘super’

7 02 2009

When I started programming with Python (a switch I would urge any casual utility programmer to make), I learned basic syntax. I very very quickly learned that if you’re ever wondering if there’s the proverbial “better way” to do something, then there is probably already a utility built into Python to accomplish the task.

Consider the ‘super’ functionality in Python: when I first encountered this, I had no idea what it was or what it meant. Furthermore, the documentation on it is very obscure and doesn’t seem to clarify anything about what it does. I Google’d it (of course) and instantly found results with titles like “Python’s Super Considered Harmful” and “Problems with the new super()“. Encouraging.

It wasn’t until I began working in Django, a pure Python-based web framework (which I would *highly* recommend, if you’re in the web business), and was reading over others’ source code, that I realized what you could do with super.

If you have a class A, which defines some method called routine, like so:
class A(object):
    def routine(self):
        print "A.routine()"

And then a class B, which inherits from A, which *also* defines a routine method:
class B(A):
    def routine(self):
        print "B.routine()"

We instantiate with this:
>>> b = B()

For simplicity, this example is stupidly simple. Now, if you instantiate B and call its only method, you’ll only ever be running B’s routine method, and no matter how many times you call it, it’ll never give you that of class A.

Now consider if class B had a line added to the end of its method:
class B(A):
    def routine(self):
        print "B.routine()"
        super(B, self).routine()

In this case, ‘super’ returns a proxy object that looks up methods starting with the class that comes after B in the inheritance chain, which here is A.  You give super() a handle to the current class, and then an actual instance of that same class.  Hence, we gave it “B”, and then “self”.  There is no separate A object hiding underneath the B object; the proxy simply finds A’s version of the method and binds it to the same instance (the one held in the local variable ‘b’, because we passed it ‘self’).  This is called a “bound” super object, because it refers to an actual instance in memory, instead of just the class blueprint.

This is what happens when we create a new B object now:
>>> b = B()
>>> b.routine()
B.routine()
A.routine()

Simply put, this kind of usage of the super method is often used to “pass control up” to the parent class, after the subclass intercepts data.
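If you want to try the "pass control up" idea yourself, here is the same pair of classes as a self-contained sketch in Python 3 syntax. Two things differ from the snippets above, both on purpose: the methods return lists instead of printing so the call order is easy to inspect, and super() is written without arguments, which Python 3 allows and which means the same as super(B, self) here.

```python
class A:
    def routine(self):
        return ["A.routine()"]


class B(A):
    def routine(self):
        calls = ["B.routine()"]
        # Hand control up to the next class in line (A), still acting on this instance.
        calls.extend(super().routine())
        return calls


b = B()
print(b.routine())
```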

Finally, if you’re interested, here is a more practical example:
from some.package import A
# Note here that we don't know anything about the inner workings of A
# except that it has some method called "render" which takes lots of
# arguments.  The only argument that we know about is 'foo'.
# The goal is to make our own class to replace A, so that we can
# do something to the data, and then gives control back to A, so
# that program flow is uninterrupted, and so that we don't have to
# ever know how A actually works.

class myClass(A):
    def render(self, foo, *args, **kwargs):
        ''' this receives a var named 'foo', a tuple of
        unnamed 'args', and a dictionary of named 'kwargs' '''

        # Append a quick prefix to the variable 'foo'
        foo = "intercepted by myClass - " + foo
        super(myClass, self).render(foo, *args, **kwargs)

In this example, we don’t need to know anything about class A, except for the fact that we want to alter the variable ‘foo’ when it comes into A‘s render method.  Note that ‘*args’ catches any unnamed arguments passed to myClass, and that ‘**kwargs’ is the common abbreviation for ‘key-word arguments’.

Also note that the only reason why myClass‘s render method ALSO takes bunches of arguments is because we model it to look exactly like A‘s render method.  We want myClass to seamlessly integrate with some other code.  That other code should never have a reason to know the difference between A and myClass.

All this does is change ‘foo’, and then pass control back up to the parent class A, where the data was intended to go.  We cleanly call super, which gives us a proxy for A, with all of its unknown methods.  We then call ‘render‘ on that proxy, in order to execute A‘s own render method (and not our overridden one in myClass).

By passing A its arguments with those prefixing * characters, we preserve how they were passed into myClass.  Keyword arguments get turned into a dictionary while in myClass.render, but A.render wants them as keyword arguments still, not a dictionary.  So, we use the unpacking ** characters to turn the dictionary back into keyword arguments (and the single * to spread the tuple back out into positional arguments).

Clean, huh?  This is extremely common in Django code, because Django gives you base classes to model from.  You then have the power to easily override those model methods, do some custom task, and then pass control back up to the model’s method for the intended behavior.

While super is nice, it follows a single chain: the class’s method resolution order (MRO).  With multiple inheritance, where several parent classes define the same method name, super calls the next class in that order, and the remaining parents only run if each class along the way calls super in turn.  When you want one particular parent’s method, you can directly invoke it in a more manual manner, such as “SecondParentClass.render(self, foo, *args, **kwargs)”.  Note that you pass ‘self’ explicitly in that method call, because the method is looked up on the class rather than on the instance.
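Here is a sketch of both approaches under multiple inheritance, in Python 3 syntax. All four class names are hypothetical, and the methods return lists so the order of calls is visible.

```python
class Base:
    def render(self, foo):
        return [foo]


class First(Base):
    def render(self, foo):
        return ["First"] + super().render(foo)


class Second(Base):
    def render(self, foo):
        return ["Second"] + super().render(foo)


class Child(First, Second):
    def render(self, foo):
        # super() walks Child's method resolution order, one class at a time.
        return ["Child"] + super().render(foo)


order = [cls.__name__ for cls in Child.__mro__]
trace = Child().render("foo")

# The manual route: name one parent directly and pass the instance yourself.
direct = Second.render(Child(), "foo")

print(order)
print(trace)
print(direct)
```

Because First and Second both call super, every class in the chain gets its turn exactly once. If First skipped its super call, Second.render would never run for a Child instance, which is the case where the manual call earns its keep.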

Python 2.6 & MySQL

4 02 2009

For any of you who both run Windows and use MySQL as your database backend, you may have found MySQLdb already.  You may also notice that there is no release for Python 2.6.x on the page.  That struck me as odd, since Python moved into the 2.6 days near the end of 2008, and then shortly thereafter, they announced and released Python 3.  Why hasn’t MySQLdb caught up?

So I’m stuck with Python 2.5.4?  you say.  Not quite.  If you run Linux, just compile the MySQLdb source on your 2.6 version of Python, or if you run Windows (and heaven knows it’s hell trying to compile anything from source on a Windows machine) then you should just download this obscure 2.6 release of MySQLdb.  The only reason I know about it is because I’m on the mod_python mailing list, and someone had a question about the missing 2.6 version.

So that you’re properly warned, Python 2.6 deprecated the sets module, from which MySQLdb imports ImmutableSet, so a deprecation warning appears when MySQLdb gets fired up.  MySQLdb works just fine still, but it’ll likely need some work to become compatible with Python 3.

Python, MySQL, and a dictionary

19 12 2008

Here is a line in Python that should be recorded for posterity…

Some background.  I have a multilingual database where several tables exist.  Each table is named after a particular “vocabulary” for my website.  For instance, there’s a “forum” table for all the terms that are used for pages related to the forums.  The purpose of having this table is so that I have all the needed terms very isolated into a nice hierarchy.  The SELECT * sql would look like this.. (pardon the fake Spanish translations)

mysql> SELECT * FROM forum;
| ID                  | enUS                       | es                           |
| Audio Review        | Audio Review               | [Audio Review]               |
| by                  | by                         | [by]                         |
| complete            | completed                  | [completed]                  |
| current             | current                    | [current projects]           |
| dropped             | abandoned                  | [dropped]                    |
| loading             | Loading...                 | [Loading...]                 |
| no                  | no                         | no                           |

As you can see, there is a column “ID”, which is what I use site-wide to access a term, and I select only one other column, according to the language code the user has his/her preference set for.  Thus, a general select for such a thing is …

mysql> SELECT ID,enUS FROM forum;
| ID                  | enUS                       |
| Audio Review        | Audio Review               |
| by                  | by                         |

… etc, etc.  Then, I just used PHP to fetch these results into an associative array.  I would then have to loop through each record, saving whatever is in the $row['ID'] spot as a key to this array, and whatever is in the $row[currentLanguage] spot as the value to said key.

So anyway..  I was trying to achieve this effect in Python; a Dictionary-type object where I can simply index into it with the term that I want, in order to get the language-translated value represented by the term.  I knew there had to be a way to do a famous Python one-liner, and since those one-liners typically clock much faster than manual loops..

resultSet = (map(None, row.values()) for row in rows)

Seems simple, but it’s a hassle to work it out when your data is a tuple of dictionaries 😛
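For comparison, here is a different way to get the term-to-translation dictionary: a dict comprehension that names the two columns explicitly instead of relying on the order of row.values(). This is a sketch in Python 3 syntax, not the line I originally used, and the sample rows below are hand-typed stand-ins for what the database would return.

```python
# Assumption: rows arrive as a tuple of dictionaries, one per record,
# keyed by column name, as described above.
rows = (
    {"ID": "Audio Review", "enUS": "Audio Review"},
    {"ID": "by", "enUS": "by"},
    {"ID": "complete", "enUS": "completed"},
)

language = "enUS"  # the column for the user's preferred language

# Map each term ID to its translation in the chosen language.
terms = {row["ID"]: row[language] for row in rows}

print(terms["complete"])
```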

Java: Array Slices

14 11 2008

First off, let me start by saying that I don’t really like Java.  We’re working from that assumption, understand? 🙂 

That being said, I have to work in it every day at ekiwi while we develop web “scrapes” for clients.  Our screen-scraper software can utilize Interpreted Java (via Beanshell), Python, JavaScript, VBscript, Jscript, and probably a few more that I’m forgetting to mention.  However, to maintain a standard, we write our scraping scripts in Java.

Given the fact that I think Java is ridiculously too verbose, I was happy to find a way to at least partially get the effect of PHP or Python array slices.  Here it goes. Since the JDK 1.2, you can use the following code.

The scenario is that we have one array, and we want to cut it into 2 halves. Theoretically, you could just copy the first half, or the second half, or any segment in the middle, so long as you have indexes to the array.

// Interpreted Java
String[] myArray = new String[ 11 ];

// Get the sizes for the arrays.
int firstHalfSize = myArray.length / 2;
int secondHalfSize = myArray.length - firstHalfSize;

String[] firstHalf = new String[ firstHalfSize ];
String[] secondHalf = new String[ secondHalfSize ];

// Arguments are: sourceArray, sourceStartIndex, destinationArray, destinationStartIndex, numElementsToCopy
System.arraycopy( myArray, 0, firstHalf, 0, firstHalfSize );
System.arraycopy( myArray, firstHalfSize, secondHalf, 0, secondHalfSize );

The “System.arraycopy” method is very efficient; it’ll beat any loop you can write in your own code.  (This is precisely why forward-thinking scripting languages like Python build this functionality into arrays from the start with a syntax like “myArray[ startIndex : stopIndex ]”, where the colon is literal and the element at stopIndex is not included.)
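For reference, the same split into two halves looks like this with Python slices. It is a sketch in Python 3 syntax with made-up variable names, using an 11-element list to match the Java example.

```python
my_list = list(range(11))     # stand-in data: 0 through 10

middle = len(my_list) // 2    # integer division, like Java's int division

first_half = my_list[:middle]     # indexes 0 up to, but not including, middle
second_half = my_list[middle:]    # index middle through the end

print(first_half)
print(second_half)
```

As with System.arraycopy, both halves are new lists; changing them does not touch the original.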

So what does this mean?  Use it.  Bring happiness to those people who wind up using Java by speeding up an overly-verbose interpreted language.