Tuesday, December 15, 2009

SciPy India 2009 update

Today is the fourth day of the 2009 SciPy India conference.  Although the first SciPy conference in the US was held in 2002, 2008 was the first time the conference was held in Europe and this year is the first time the conference was held in India.  It is a sign of the growing interest in using Python for scientific computing that there are now three annual conferences.

During the SciPy 2009 conference in August, Prabhu Ramachandran spoke about the Free and Open source Software for Science and Engineering Education (FOSSEE) project he was running at IIT Bombay.  The FOSSEE project is an ambitious effort to promote the use of Python for numerical computing in college curricula.  Prabhu has an interesting post on the contributions the scientific Python community has made to the larger Python community.  FOSSEE is actually just one part of an even more ambitious $1 billion (US) government program called the National Mission on Education through Information and Communication Technology.

Starting at the end of May 2009, Prabhu very quickly gathered together an amazing team that immediately created a significant amount of documentation and training materials including tutorials, audio/video demonstrations, written material, and lectures.  They've created a great two-day hands-on introductory tutorial to scientific programming with Python and have already conducted several of these tutorials all across India.  Now they are working on creating a couple of semester-long college courses and will be offering the first one next semester at IIT Bombay.

At the end of the SciPy 2009 conference in August, Prabhu proposed that we put together a SciPy conference in India and I immediately agreed.  Not wanting to delay, we decided to have the conference before the end of the year.  After all, putting together an international scientific conference in less than four months was in keeping with the overall ambition of the FOSSEE project.  As soon as Prabhu returned to Mumbai, he contacted Vimal Josef at SPACE Kerala about hosting the conference in Thiruvananthapuram.  Shortly after that we announced the first international conference on Scientific Computing with Python (Scipy.in 09), to be held from December 12th to the 17th at Technopark, Thiruvananthapuram, and sponsored by FOSSEE, IIT Bombay and SPACE Kerala.

Once we finalized the dates for the conference, I called Travis Oliphant, the president of Enthought, and asked him to deliver the keynote address, which he quickly agreed to do.  Among his many accomplishments, Travis is one of the original authors of SciPy and the primary developer of NumPy.  David Cournapeau (one of the core NumPy and SciPy developers) and Chris Burns (one of the core developers of the neuroimaging in Python project) also agreed to deliver invited talks.

The FOSSEE and SPACE teams were invaluable in organizing the conference.  In particular, Madhusudan.C.S from the FOSSEE team worked very closely with me on the conference website and putting together the conference program.  I will write another blog post in the next day or so with a description of the actual conference.  For now, you can read a short write-up from one of the local newspapers.

Sunday, November 29, 2009

Sunday in Paris

I spent most of today working on the SciPy 2009 proceedings with Gael and catching up on sleep and email.  For dinner, Gael, Emmanuelle, and I met Jean-Baptiste Poline at the Denfert-Rochereau station and found a very traditional French wine bar called Au Vin des Rues, which is on rue Boulard just off of rue Daguerre and was open on a Sunday evening.  (I had a delicious slow-roasted leg of lamb with potatoes au gratin and rum baba for dessert.)  The rue Daguerre has a wonderful pedestrian street market I often seem to visit when I am in Paris.  Here is a picture looking down the rue Daguerre (the street market is closed, of course) toward rue Boulard (you can see JB just right of center with Gael peeking over Emmanuelle's shoulder):

PJ Toussaint, who just flew back from a conference in Greece, joined us just in time for dessert.

Saturday in Paris

Just thought I'd try to keep a little journal of my trip.  We'll see how long this lasts.  Anyway, I landed at Charles de Gaulle at about 6:30am on Saturday morning and took the RER to Bourg-la-Reine to stay with Gael and Emmanuelle.  Here is the entrance to where I am staying:

And the view from my window:

Once I arrived I took a short nap and then went out with Gael to hunt and gather in the market above the passage he lives on.  Here are two pictures I took at the local market (they had all kinds of things, but cheese and meat are, of course, the things that attracted me most):

We also went into a frozen food store called Picard:

After lunch, Gael and I headed to Paris.  Over the last five years, I've tried to visit the catacombs numerous times.  Unfortunately, every time I've visited, they've been closed.  This time turned out to be no different.  The picture on the left is Gael standing in front of the entrance to the catacombs (you can see a sign on the door, which states that they will be closed for the next month) and the picture on the right is of the road behind me:

Since I couldn't visit the catacombs, we decided to head to the Montmartre district to walk around for the afternoon.  Here are a couple of pictures of the Sacre Coeur at the summit of Montmartre:

I forgot to take pictures for the rest of the day, but after Montmartre we headed back to the Latin Quarter to spend a couple hours talking about the SciPy proceedings (which we hope to finish today) in a cafe with wireless internet.  And we grabbed an early dinner at 8pm with Emmanuelle and a couple of colleagues from Neurospin.

Wednesday, November 18, 2009

NumPy 1.4 coming soon!

Nearly eight months after NumPy 1.3, NumPy 1.4 will be released well before the holidays.  This release comes with the usual raft of bug fixes, performance improvements, new features, and improved documentation.

Our web-based documentation editing system continues to be a great asset.  In just over a year, this system has helped us to vastly expand and improve our documentation.  For instance, our reference guide has gone from under 10,000 words to over 110,000.  When Guido came to visit Berkeley a few weeks ago, Fernando Pérez showed him the web-based documentation editor and he was very impressed and even commented that it would be nice for the standard library to use a system like this.

David Cournapeau is again serving as release manager and he is also responsible for many of the code improvements in this release.  I just noticed the other day that, according to ohloh, David is quickly approaching Travis Oliphant's number of commits.  While it is hard to attach any specific meaning to this statistic, it is clear that at this point David is one of the most significant contributors to NumPy.  Among his many contributions to this release, he reduced numpy's import time by 20%-30% by adding a small numpy.lib.inspect module and using it instead of the upstream inspect module.  Another very useful improvement by David is that you can now link against the core math library in an extension.

In addition to all the work David's done for the 1.4 release, some of his recent work won't be included until the 1.5 release.  Once David branches for 1.4, he has already promised to merge his Python 3 support for numpy.distutils into the trunk.  While we are just beginning to plan migrating to Python 3, this is an important early step.

Unfortunately, I am not sure the new datetime dtype support for dealing with dates in arrays will be included in the 1.4 release.  This useful functionality was developed over the summer by Travis Oliphant and Marty Fuhry.  Marty was my Google Summer of Code student, although I was pretty busy, so Pierre Gerard-Marchant did most of the day-to-day mentoring.  Despite the fact that this code was merged with the trunk at the end of the summer, there is a reasonable chance that it will be pulled before the 1.4 release due to the lack of documentation for the public C API.

I've only touched on a few of the many improvements you can expect to see with NumPy 1.4.  For more details about the upcoming release, please see the release notes.  Thanks to everyone who worked on this release and to David in particular.

Friday, November 13, 2009

SciPy 2010 coming to Austin, TX (6/28 - 7/4)

Mark your calendar!  The 2010 SciPy Conference will be held in Austin, Texas from Monday, June 28th to Sunday, July 4th.  We are still in the early planning stages, but expect to have two days of tutorials, two days of conference, and three days of sprints:

  Tutorials     Monday (6/28) - Tuesday (6/29)
  Conference    Wednesday (6/30) - Thursday (7/1)
  Sprints       Friday (7/2) - Sunday (7/4)

From 2002 to 2009, the SciPy conference was held at Caltech with the generous support of the Center for Advanced Computing Research.  This year we've decided to start holding the conference at rotating locations to ensure more people will get the opportunity to attend the conference.  Since the initial conference in 2002, the scientific computing in Python community has rapidly grown.  In addition to the main SciPy conference, we now have dedicated SciPy conferences in both Europe and India.  Moreover, there are an increasing number of non-SciPy conferences with dedicated sessions on scientific computing with Python.  For example, the SIAM Conference on Computational Science and Engineering (CSE09) had a three-part minisymposium on Python for Scientific Computing.

This year we will also hold the conference earlier in the summer than in previous years--moving from the end of August to the end of June.  Holding the conference at the end of August meant that the conference coincided with the start of the fall semester for many attendees, which caused some scheduling conflicts.  Having the conference in late August also made it difficult to get a quick turnaround on publishing our post-conference proceedings since most of the authors and reviewers had to focus on the beginning of the academic year.

We are also extending the post-conference sprint from two days to three.  Our post-conference sprints have been increasingly successful and we hope that by extending the sprint we will be able to get a lot more done.  We are also hoping that moving the conference to earlier in the summer will give us more time to finish work started during the sprint before the fall semester.  For example, this year David Warde-Farley and I worked on migrating the SciPy homepage from a MoinMoin wiki to a Sphinx site.  During the sprint we made significant progress on the new site, but stopped short of being able to actually deploy it.  Neither David nor I have had time to finish our work because the academic semester started immediately after the conference.  The changes to the conference dates will hopefully increase our success in getting bigger projects that are started during the sprint finalized before the start of the semester.  (Work is slowly progressing on the new site and I hope to switch to the new site in December, during the SciPy India 2009 sprint.)

And if you stay for the entire sprint, you will get to be part of a Texas-sized celebration of the fourth of July.  I was able to witness the festivities myself two years ago during the 2008 Mayavi sprint.  While some people may not be able to attend the sprint because of its overlap with the fourth of July celebration, these dates were our best options given all our constraints.  This summer is going to be a very busy one for the SciPy community.  There are at least two other SciPy-related events going on during the same time as our conference.  Sage Days 22 on elliptic curves will be held in Berkeley from June 21st to July 2nd, which unfortunately means that few Sage developers will be able to attend SciPy 2010.  And the 2nd European Seminar of Coupled Problems (ESCO) will be held June 28th to July 2nd in the Czech Republic.  It will feature a track on next generation scientific computing with a focus on Python.  While it is great to see scientific computing with Python being presented in more and more venues, this particular conference means that Gaël Varoquaux will most likely not be attending the 2010 SciPy conference.  As the program chair for both SciPy 2008 and SciPy 2009, Gaël has been essential to the success of the last two SciPy conferences.

Austin is called the "Silicon Hills" due to the large number of technology corporations located there and it advertises itself as the live music capital of the world.  Almost every establishment provides live music every evening.  Restaurants offer a wide variety of interesting fare with a focus on BBQ and Mexican Food.  If you need some exercise, the river that runs through the downtown area has a very well-used running path, and the hill country surrounding Austin has many hiking and biking trails.  And most importantly, Austin is home to Enthought.  Enthought is the main sponsor of both the SciPy project and the conference.

I am looking forward to the SciPy 2010 conference and hope to see many of you there.  If you are interested in helping out with the program committee, please send me an email.  I would also like to continue and expand the student travel funding, so if you are interested in sponsoring students to attend the conference, please contact me.  I will be posting more information regarding the venue and timeline in January (after I finish organizing the SciPy India conference).

Saturday, November 7, 2009

SciPy 2009 proceedings coming soon ...

This is the second year that we are going to publish proceedings for the SciPy conference. Last year's proceedings included 17 articles ranging from discussions on recent developments in the core projects to research articles describing how our community-developed software tools were used in different fields.

Gaël Varoquaux, Stéfan van der Walt, and I are editing the proceedings this year. All accepted articles have been reviewed and sent back to the authors for revision. The deadline for the revised articles is November 14th and we plan to release the proceedings by the end of the month. The articles are looking very good and I expect that the complete proceedings will be very informative and useful for the community.

Thursday, November 5, 2009

SciPy India 2009 Call for Presentations

The SciPy India 2009 Program Committee is currently developing the conference program. We are seeking presentations from industry as well as the academic world.

We look forward to hearing how you are using Python! For more information, please read the full call for presentations.

About the SciPy India 2009 Conference

The first SciPy India Conference will be held from December 12th to 17th, 2009 at the Technopark in Trivandrum, Kerala, India.

The theme of the conference is "Scientific Python in Action" with respect to application and teaching. We are pleased to have Travis Oliphant, the creator and lead developer of numpy, as the keynote speaker.

Please register here.

Important Dates
  • Friday, Nov. 20: Abstracts Due
  • Friday, Nov. 27: Announce accepted talks, post schedule
  • Saturday-Sunday, Dec. 12-13: Conference
  • Monday-Tuesday, Dec. 14-15: Tutorials
  • Wednesday-Thursday, Dec. 16-17: Sprints

Wednesday, November 4, 2009

A visit from Guido van Rossum

We had a number of interesting visitors at Berkeley today. Guido van Rossum, the creator of Python, came to a special meeting of our Py4science group. Fernando Perez started the meeting with a 15-minute whirlwind overview of scientific computing with Python. He started by quickly presenting the basic stack of scientific software in Python, which most of us use. And he finished by highlighting some of Andrew Straw's recent work, the central role Python plays at the Space Telescope Science Institute, some of Enthought's contributions to the community, and the FOSSEE project run by Prabhu Ramachandran.

After Fernando finished his introduction, we heard nine short 4-minute lightning talks in rapid succession. This format really underscored how important Python is for so many different scientific disciplines. In addition to several presentations by faculty, staff, and students from UC Berkeley and LBL, we were fortunate to have two other visitors. William Stein spoke about Sage before running upstairs to present at the number theory seminar. And Ondřej Čertík spoke about sympy.

Following the formal presentations, Fernando facilitated an open discussion with Guido where we talked about everything from the transition to Python 3000 to the unladen swallow project. Hopefully, Fernando will post more details including links to the slides on his blog soon.

Saturday, July 11, 2009

SciPy 2009 early registration extended to July 17th

The early registration deadline for SciPy 2009 has been extended for one week to July 17, 2009. Please register by this date to take advantage of the reduced early registration rate.

About the conference

SciPy 2009, the 8th Python in Science conference, will be held from August 18-23, 2009 at Caltech in Pasadena, CA, USA. The conference starts with two days of tutorials on the scientific Python tools. There will be two tracks, one introducing the basic tools to beginners, and one for more advanced tools. The tutorials will be followed by two days of talks. Both days of talks will begin with a keynote address. The first day’s keynote will be given by Peter Norvig, the Director of Research at Google, while the second keynote will be delivered by Jon Guyer, a Materials Scientist in the Thermodynamics and Kinetics Group at NIST. The program committee will select the remaining talks from submissions to our call for papers. All selected talks will be included in our conference proceedings edited by the program committee. After the talks each day we will provide several rooms for impromptu birds of a feather discussions.
Finally, the last two days of the conference will be used for a number of coding sprints on the major software projects in our community.

For the 8th consecutive year, the conference will bring together the developers and users of the open source software stack for scientific computing with Python. Attendees have the opportunity to review the available tools and how they apply to specific problems. By providing a forum for developers to share their Python expertise with the wider commercial, academic, and research communities, this conference fosters collaboration and facilitates the sharing of software components, techniques, and a vision for high level language use in scientific computing.

For further information, please visit the conference homepage:

Important Dates

Friday, July 3: Abstracts Due
Wednesday, July 15: Announce accepted talks, post schedule
Friday, July 17: Early Registration ends
Tuesday-Wednesday, August 18-19: Tutorials
Thursday-Friday, August 20-21: Conference
Saturday-Sunday, August 22-23: Sprints
Friday, September 4: Papers for proceedings due

SciPy 2009 Executive Committee

Jarrod Millman, UC Berkeley, USA (Conference Chair)
Gaël Varoquaux, INRIA Saclay, France (Program Co-Chair)
Stéfan van der Walt, University of Stellenbosch, South Africa (Program Co-Chair)
Fernando Pérez, UC Berkeley, USA (Tutorial Chair)

Saturday, March 28, 2009

NumPy 1.3 coming soon!

David Cournapeau just announced the first release candidate for NumPy 1.3. This release comes just over 6 months since the last minor release. It includes numerous bug-fixes, documentation improvements, improved Windows support, major code cleanups and refactoring, as well as several new features. NumPy 1.3 provides Python 2.6 compatibility on all supported platforms, an important first step in migrating to Python 3. This release also provides experimental Windows 64 bit support.

An important highlight of this release is the addition of generalized universal functions. In NumPy, a universal function (or ufunc) is a wrapper that provides a common interface to mathematical functions, which operate on scalars and can be made to operate on ndarrays in an element-by-element fashion. NumPy provides a number of built-in ufuncs for mathematical operations, trigonometric functions, bit-twiddling, comparisons, etc. For example, add is a ufunc that wraps scalar addition to provide addition of ndarrays. Ufuncs support array broadcasting, type casting, as well as several other features. Generalized ufuncs extend this idea to functions on ndarrays. In other words, while ufuncs are limited to wrapping element-by-element functions, generalized ufuncs support ndarray-by-ndarray operations. There is similar functionality and terminology in PDL, the Perl vector library. There is already some documentation for this powerful new feature, but as always we need help improving it, which anyone can do by simply clicking here. In particular, a few examples would be very helpful.
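To make the distinction concrete, here is a minimal sketch of an ordinary ufunc next to a generalized one. It assumes a recent NumPy (1.16 or later), where matmul happens to be implemented as a generalized ufunc; that choice is mine for illustration and is not how the feature was exposed in the 1.3 release itself.

```python
import numpy as np

a = np.array([[1.0, 2.0], [3.0, 4.0]])
b = np.array([10.0, 20.0])

# Ordinary ufunc: add works element by element and broadcasts b
# across the rows of a.
summed = np.add(a, b)

# Generalized ufunc: matmul consumes whole 2-D sub-arrays (its "core"
# dimensions) and loops over any leading dimensions.
stack = np.stack([a, 2 * a])           # shape (2, 2, 2): two matrices
prod = np.matmul(stack, np.eye(2))     # one matrix product per stacked matrix

# A plain ufunc has no core signature; a generalized ufunc does.
plain_signature = np.add.signature         # None
general_signature = np.matmul.signature    # a string describing core dims
```

The signature attribute is the quickest way to tell the two kinds apart: it is None for element-by-element ufuncs and a string describing the core dimensions for generalized ones.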

Finally, as the 1.3 release manager, David focused a lot of attention on improving our release infrastructure (in addition to his numerous efforts with every aspect of the project). In particular, he made great progress in automating the release process. This effort should provide immediate pay off by reducing the time and effort needed to make future NumPy and (potentially) SciPy releases. In turn this will help increase the frequency with which we are able to get bug-fixes, improved documentation, speed optimizations, and new features out to the wider community.

Friday, March 27, 2009

SciPy 2009 Conference (August 18-23, 2009)

I am pleased to announce that the 8th Annual Python in Science Conference will be held August 18-23, 2009 at Caltech in Pasadena, CA.
  • Tutorials (Tuesday, August 18 - Wednesday, August 19)
  • Conference (Thursday, August 20 - Friday, August 21)
  • Sprints (Saturday, August 22 - Sunday, August 23)
This conference provides a unique opportunity to learn and affect what is happening in the realm of scientific computing with Python. Attendees have the opportunity to review the available tools and how they apply to specific problems. By providing a forum for developers to share their Python expertise with the wider commercial, academic, and research communities, this conference fosters collaboration and facilitates the sharing of software components, techniques and a vision for high level language use in scientific computing.

Wednesday, March 25, 2009

SciPy and Summer of Code 2009

Google is once again sponsoring a Summer of Code (SoC). For the past several years, the SciPy community has participated in the SoC program through the Python Software Foundation (PSF). Past projects have included a new testing framework, numerical optimization code, machine learning code, and improved support for NumPy in Cython. Most students have remained very active in the community and continue to be major contributors to the SciPy community.

To see a list of possible project ideas please see the PSF's SoC wiki. A number of students have already expressed interest in working on a SciPy-related project this year. If you are interested, the best place to discuss ideas is on the NumPy and SciPy developer mailing lists.

Wednesday, February 11, 2009

SciPy 0.7.0 released

I'm pleased to announce SciPy 0.7.0. SciPy is a package of tools for science and engineering for Python. It includes modules for statistics, optimization, integration, linear algebra, Fourier transforms, signal and image processing, ODE solvers, and more.

This release comes sixteen months after the 0.6.0 release and contains many new features, numerous bug-fixes, improved test coverage, and better documentation. Please note that SciPy 0.7.0 requires Python 2.4 or greater (but not Python 3) and NumPy 1.2.0 or greater.

For more information, please see the release notes or my previous post. You can download the release from here. Thanks to everybody who contributed to this release.

Wednesday, February 4, 2009

When will NumPy (and SciPy) migrate to Python 3?

Python 3.0 was released on December 3rd, 2008. This release is a major redesign of the language, which intentionally breaks compatibility with the 2.x series of releases. Now that it has been released, many projects have to decide how and when to migrate to Python 3.

The Python developers have attempted to make this transition as painless as possible. For instance, Python 2.6 helps simplify the migration path from the 2.x to the 3.x release series. Python 2.6 incorporates everything from 3.0 that doesn't introduce incompatibilities with the 2.x series. It also can be run with a -3 switch to warn about what will no longer work in Python 3. The developers also provide a Python program, called 2to3, to automatically convert Python 2.x source code to valid 3.x code.
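As a rough illustration of the mechanical rewrites involved, the sketch below shows a few Python 2 idioms (in comments) next to the Python 3 form that 2to3 would produce or that you would write by hand. The variable names are made up for the example, and it covers only a handful of the changes.

```python
counts = {"numpy": 3, "scipy": 2}

# Python 2: counts.has_key("numpy")
found = "numpy" in counts

# Python 2: counts.iteritems()
pairs = sorted(counts.items())

# Python 2: xrange(5)
total = sum(n for n in range(5))

# Python 2: except ValueError, err:
try:
    int("not a number")
except ValueError as err:
    message = str(err)

# Division changes meaning and, as far as I know, 2to3 does not rewrite it;
# code that relied on integer truncation has to switch to // by hand.
half = 7 / 2         # true division in Python 3
floor_half = 7 // 2  # floor division in both Python 2 and 3
```

The division case is a good example of why the -3 switch and a solid test suite matter: the code still runs after conversion, but it can silently compute something different.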

The suggested strategy for migrating to Python 3 is essentially:
  1. port to 2.6
  2. fix all the warnings raised by the -3 switch
  3. run 2to3
  4. fix any remaining issues
A major prerequisite for this transition is excellent test coverage. Both NumPy and SciPy currently lack comprehensive test coverage, but we are making major improvements in this area. Over the last year, we have implemented a new testing framework based on nose and have doubled the number of tests for both projects. Over the next year, we will need to continue this trend and expand our test coverage even more.

While the above procedure of using the 2to3 tool works relatively well for pure Python code, there is no automatic conversion tool for extension code. NumPy is mostly written in C and makes extensive use of the Python C-API. So converting NumPy will require much more than running the 2to3 tool. Once NumPy has been successfully ported, we will port SciPy to Python 3. Porting SciPy should be considerably easier. Regardless, before porting either project to Python 3, we will need to ensure that both projects fully support Python 2.6.

Porting NumPy/SciPy to Python 2.6

Porting to Python 2.6 is a very pressing issue as at least one Linux distribution (openSUSE 11.1) has already moved to Python 2.6. Fedora 11 (scheduled to be released on 5/26/2009) will be based on Python 2.6.

The main issue with 2.6 support is NumPy. Over the last month or so, there has been a significant focus on making both NumPy and SciPy compatible with Python 2.6 largely thanks to the efforts of David Cournapeau. For example, the upcoming SciPy 0.7.0 release has replaced md5 and popen4 with hashlib and subprocess respectively (unfortunately, it appears that subprocess has a potential race condition). On UNIX (including Mac OS X), NumPy 1.2.1 mostly works under Python 2.6. On Windows, however, 1.2.1 has a number of problems related to the compilation process. Fixing these compilation issues required some fairly extensive changes and, thus, will not be included in a 1.2.x bug-fix release. However, these issues have mostly been addressed on the development trunk and will be included in the upcoming NumPy 1.3 release.
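For readers unfamiliar with the replacements mentioned above, here is a minimal sketch of what moving from md5 and popen4 to hashlib and subprocess looks like. It is not the actual SciPy change; the hashed string and the child command are placeholders chosen for the example.

```python
import hashlib
import subprocess
import sys

# Old: md5.new(data).hexdigest()
digest = hashlib.md5(b"scipy").hexdigest()

# Old: popen4 returned a single pipe carrying both stdout and stderr.
# With subprocess, redirecting stderr into stdout gives the same effect.
child_code = "import sys; print('out'); sys.stderr.write('err\\n')"
proc = subprocess.Popen(
    [sys.executable, "-c", child_code],
    stdout=subprocess.PIPE,
    stderr=subprocess.STDOUT,
)
output, _ = proc.communicate()  # waits for the child and collects its output
```

Using communicate() rather than reading the pipe directly avoids deadlocks when the child writes more than the pipe buffer can hold.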

Hopefully, we will have a beta release of NumPy 1.3 out in a few weeks.  And we should have a release candidate out shortly after. If all goes well, both NumPy and SciPy will be Python 2.6 compatible for all platforms by March.

Porting NumPy/SciPy to Python 3

Once we finish porting to Python 2.6 and remove all the warnings raised by the -3 switch, we will be ready to start seriously planning to port NumPy to Python 3. Supporting Python 3 will require significant effort, since a lot of C code has to be ported. We have taken a preliminary look at what this port will entail and have identified at least the following issues to be addressed:
  • PyNumberMethods has changed: nb_divide, nb_coerce, nb_oct, nb_hex, and nb_inplace_divide have been removed
  • PyObject_VAR_HEAD has changed to conform to standard C
  • PyString_* is gone, all occurrences will need to be replaced with PyUnicode_* or PyBytes_*
  • PyInt_* is gone, all occurrences will need to be replaced with PyLong_*
  • Buffer interface has changed; this is a fairly big change and will require the most work
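These are C-level changes, but each one mirrors a change that is visible from Python. The sketch below shows the Python 3 side of each item, as a way of seeing what the C code has to accommodate; the Index class is a made-up example, and none of this is NumPy code.

```python
# PyInt_* -> PyLong_*: Python 3 has a single, arbitrary-precision int type.
big = 2 ** 64 + 1

# PyString_* -> PyUnicode_* / PyBytes_*: text and raw bytes are distinct types.
text = "caf\u00e9"
raw = text.encode("utf-8")

# nb_divide / nb_inplace_divide removed: "/" is true division, "//" floors.
true_div = 7 / 2
floor_div = 7 // 2

# nb_oct / nb_hex removed: hex() and oct() go through __index__ instead.
class Index:
    def __index__(self):
        return 255

as_hex = hex(Index())

# New buffer interface: memoryview exposes an object's memory without copying,
# so writing through the view changes the underlying bytearray.
buf = bytearray(b"abc")
view = memoryview(buf)
view[0] = ord("x")
```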
Given the amount of developer effort we currently have, it is difficult to imagine how we would be able to reasonably support two development branches (i.e., one for Python 2 and another for Python 3) for any significant amount of time. Obviously things may change; but, at this point, it looks like once we port NumPy to Python 3 we will only make bug-fix releases supporting Python 2.

Before porting to Python 3, we will be paying close attention to how the major Linux distributions will be handling the transition to Python 3. Fedora developers have started a lively discussion about how they will handle the transition to Python 3. The discussion indicates that there is a reluctance on their part to support both Python 2 and 3 in the same release.

With the last release of NumPy (1.2) and the upcoming SciPy release (0.7) we dropped support for Python 2.3. Since many scientists may not be able to quickly upgrade to Python 3, we will be reluctant to drop support for Python 2 for some time. Given our desire to provide the newest releases of NumPy and SciPy to as many users as possible, we will be closely listening to them to determine how quickly we can move to Python 3 and drop support for Python 2. And we will use this time to continue adding features, fixing bugs, improving documentation, and perhaps most importantly extending our test coverage.

It is likely that NumPy 1.4 and SciPy 0.8 (I am hoping that we will be able to release both by the end of 2009 or early 2010) will be based on Python 2. The following releases, NumPy 1.5 and SciPy 0.9, would be the earliest point that I can see us switching to Python 3.

Over the last year there has been a lot of discussion about the transition to Python 3 on the mailing lists, at our annual conference, during coding sprints and planning meetings, as well as private conversations. One topic that has come up is whether this transition would be a good opportunity to simultaneously do a major redesign of NumPy.

While there is a temptation to take advantage of this opportunity for a major release, we quickly realized that doing so would be a huge mistake. First, it would make it difficult for scientists and researchers to isolate the root cause of errors in their code when porting to the new release. Is the problem with the Python 3 port or the port to the new NumPy release? Second, it would be extremely poor community behavior. If other major packages succumb to this temptation as well, switching to Python 3 will become an increasingly daunting task for all the code out there that uses these packages. So when we port NumPy to Python 3, we will do so in a release that includes no API or ABI changes not strictly related to Python 3.

Friday, January 23, 2009

SciPy 0.7.0 release candidate

I just released the second release candidate for SciPy 0.7.0. Due to an issue with the Windows build scripts, the first release candidate wasn't announced. Unless a major regression or release blocker is discovered, this will become the 0.7.0 final release in a few weeks.

For more information, please see the release notes and my previous blog post. You can download the release from here.

Sunday, January 18, 2009

Matrix SIG archive (1995-2000) back online

I have recently been interested in the early history of multidimensional array support in Python. Most of the early development was discussed on the Matrix SIG. In January 2000, the Matrix SIG was retired and further discussion moved to the numpy-discussion mailing list hosted at sourceforge. More recently the mailing list moved to the SciPy server, which is hosted by Enthought. The archive from 2000 to the present is here.

Unfortunately, the Matrix SIG archive (1995-2000) was missing, and had been since at least 2005 when Robert Kern noticed that it had disappeared. Thanks to Skip Montanaro, Brad Knowles, and Barry Warsaw the archive has been restored.

Monday, January 12, 2009

SciPy 0.7 coming soon . . .

After 16 months of hard work the next stable release of scipy is almost ready to be tagged. This is the most significant scipy release in several years. It contains many new features, numerous bug-fixes, improved test coverage, and better documentation.

The scipy developer community started porting scipy to numpy at the end of 2005. Most of the work during 2006-2007 was focused on porting and bug-fixes and culminated in scipy 0.6. However, since the 0.6 release, the development effort has shifted from porting and maintenance to a much greater focus on infrastructure, architecture, and functionality.

One of the most important developments during the last year has been the extensive work on the testing and documentation framework. The improvements to testing and documentation first appeared in the numpy 1.2 release.

Our new testing infrastructure is based on the nose testing framework. This testing framework makes writing tests much easier than previously. The simplified testing framework as well as an increasing recognition of the importance of unit testing among the scipy developers has led to a doubling of the number of tests since the last stable release.
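To give a feel for how little boilerplate this style of test needs, here is a minimal sketch. The trapezoid function is a hypothetical helper written for this example, not part of scipy; the point is that a test is just a plain function whose name starts with test_ and whose body uses ordinary assert statements, which a nose-style test runner collects automatically.

```python
import math


def trapezoid(f, a, b, n=1000):
    """Hypothetical helper: composite trapezoidal rule on [a, b]."""
    h = (b - a) / n
    total = 0.5 * (f(a) + f(b))
    for i in range(1, n):
        total += f(a + i * h)
    return total * h


def test_trapezoid_linear():
    # The trapezoidal rule is exact (up to rounding) for linear functions.
    assert abs(trapezoid(lambda x: 2 * x, 0.0, 1.0) - 1.0) < 1e-9


def test_trapezoid_sine():
    # The integral of sin over [0, pi] is 2.
    assert abs(trapezoid(math.sin, 0.0, math.pi) - 2.0) < 1e-5
```

No test class, registration, or suite-building code is required, which is a large part of why adding tests became so much more common.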

This release also brings immense documentation improvements. You can now view the scipy reference manual online or download it as a PDF file. The new reference guide was built using the popular Sphinx tool. We have also updated the scipy tutorial, which hadn't been touched for several years. Both the reference manual and the tutorial are easily editable using our web-based documentation editor. If you want to improve the documentation, please register a user name in our web-based editor and correct the issues.

In addition to huge improvements in the testing and documentation infrastructure, this release cleans up a number of annoyances and removes old cruft. There have been a number of deprecations and well-documented API changes in this release. This is also the first stable release since the sandbox was removed.

We also did a fairly extensive review of the code to make sure that all the code is correctly licensed. About a year ago, I noticed that there was code not licensed under the revised BSD license. I was trying to "correct" the license information for the scipy Fedora package, but was told that the Fedora packaging policy required including all licenses found in a package. After doing some grepping and looking at svn commit logs, it was easy to figure out what code needed to be relicensed and who had committed the code. Fortunately I was able to track everyone down and everyone kindly agreed to relicense their code. There was also some code included in scipy derived from "Numerical Recipes in C" code. Unfortunately, the "Numerical Recipes in C" code doesn't permit redistribution. The scipy developers quickly reimplemented the offending code from scratch. With the 0.7 release, scipy only includes code licensed under the revised BSD.

The 0.7 release should be out by the end of the month and we have already started working on the next feature release. During the development of the 0.7 release, there has been a rapid increase in community involvement and numerous infrastructure improvements to lower the barrier to contributions (e.g., more explicit coding standards, improved testing infrastructure, better documentation tools). I look forward to seeing this trend continue and invite everyone to become more involved. To learn more about the changes in the 0.7 release, please see the release notes.