Gabriel's Musings

March 25, 2009

Things I hate about Python/Numpy/Scipy

Filed under: Computers, General, Python — ggellner @ 9:28 pm

Recently reading the great post
http://use.perl.org/~brian_d_foy/journal/32556?from=rss inspired me to get
off my chest the things I hate about Python/NumPy/SciPy. (With the
understanding that I love Python for science and use it almost exclusively.
But if I had a million dollars and infinite influence, I would fix these.)

  1. Keyword arguments are passed in as a dictionary and are therefore unordered.
    This makes it a constant pain to use nice syntax to define ordered types. Say,
    for example, I want a list-like object with element names (a la R). In a
    perfect world I would be able to say NList(a=1, b=2, c=3) and expect the order
    of the resulting list to mimic the order of the keyword definitions (again, as
    R does). Instead I need to use NList(('a', 1), ('b', 2), ('c', 3)), which is
    plain ugly and pedantic. Worse, with the new OrderedDict type of Python 2.7/3.1
    you can use keyword arguments and get arbitrary order! This seems like a big
    interface wart; it should not have been allowed.
  2. NumPy/SciPy’s interfaces are painfully inconsistent. At best they mimic
    Matlab, which is a bad idea: Matlab’s argument unpacking is more powerful
    than Python’s, and with objects its style is suboptimal in any case.
    Mathematica and R/S++ are much better models for what a powerful programming
    language means for scientific interfaces.
  3. The BSD license requirement for NumPy/SciPy. I understand the arguments for
    this, but I find it painful to watch libraries being rewritten from scratch
    when amazing GPL equivalents exist. Python for science is worse off for it.
    R, Octave, and GSL have giant communities behind them, and the Python world,
    in my opinion, will waste all of its developers trying to reproduce their
    work in a BSD way.
  4. NumPy arrays have too many methods, such as std, var, and cov, which should
    be functions instead. An ndarray should just be a fast multidimensional object
    that can be reshaped, accessed, and operated on. If a richer object with
    statistical functions is needed, it should be a separate subclass, as these
    methods add bloat and are often not needed in the contexts where ndarrays are
    used. Further, the default behavior of std uses the population formula instead
    of the de facto standard of the sample formula (used by every other scientific
    package, ever). Some developers see this as a needed revolution, but as a
    method on an extension type it cannot easily be overridden (without a serious
    slowdown in all method calls). That makes for the annoying code of
    .std(ddof=1) all the time, unless you break down and use the default (which
    would require explanation in almost any paper) or warn people not to use the
    method!
  5. Finally, a minor gripe: something I covet from R, but don’t think can be
    added in a Pythonic manner, is the ability to use the values of other default
    arguments in a default argument. That is, something like
    size(n=10, step=n/2). In Python we would need to use extensive None defaults
    and handle the logic in the function body, which hides the real defaults from
    the signature and so removes documentation.
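
To make gripe 1 concrete, here is a minimal sketch of the pair-based workaround. NList is the hypothetical container from the text, not an existing library class, and its methods (names, indexing by position or by name) are my own assumptions about what such a type might offer. Passing the pairs positionally keeps the order explicit, however a given Python version treats keyword arguments.

```python
from collections import OrderedDict


class NList(object):
    """Hypothetical list-like container with named elements (a la R)."""

    def __init__(self, *pairs):
        # Each positional argument is a (name, value) pair; OrderedDict
        # remembers the order in which the pairs were given.
        self._items = OrderedDict(pairs)

    def names(self):
        return list(self._items.keys())

    def __len__(self):
        return len(self._items)

    def __getitem__(self, key):
        # Integer keys index by position, anything else by element name.
        if isinstance(key, int):
            return list(self._items.values())[key]
        return self._items[key]


nl = NList(('a', 1), ('b', 2), ('c', 3))
print(nl.names())       # names come back in definition order
print(nl[0], nl['c'])   # positional and by-name access
```

It works, but the call site is exactly the noisy tuple syntax I would like to avoid.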
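
For gripe 4, a small sketch of the difference between the two formulas. The data values are made up for illustration, and sample_std_fn is a hypothetical helper name, not something NumPy provides; it only shows the kind of plain-function wrapper one ends up writing to avoid repeating ddof=1.

```python
import numpy as np

# Illustrative data only: mean is 5, sum of squared deviations is 32.
data = np.array([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])

# Default: population formula, divides by N (ddof=0).
pop_std = data.std()

# Sample formula, divides by N - 1; this is what R's sd() computes.
sample_std = data.std(ddof=1)


def sample_std_fn(values):
    """Hypothetical helper: sample standard deviation as a plain function."""
    return np.std(values, ddof=1)


print(pop_std)                # sqrt(32 / 8)
print(sample_std)             # sqrt(32 / 7)
print(sample_std_fn(data))    # same as sample_std
```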
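
And for gripe 5, the None-default idiom looks roughly like this. The size function is the hypothetical one from the text, and its return value is invented just so the sketch has something to show. Note that the signature only advertises step=None, so the real default has to be spelled out in the docstring.

```python
def size(n=10, step=None):
    """Hypothetical function from the text.

    step defaults to n / 2, but the signature cannot say so.
    """
    if step is None:
        # The dependent default has to be computed inside the body.
        step = n / 2
    return n, step


print(size())              # step derived from the default n
print(size(n=4))           # step derived from the given n
print(size(n=4, step=1))   # explicit step wins
```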

Well, that feels better.
