Friday, January 23, 2009
SciPy 0.7.0 release candidate
For more information, please see the release notes and my previous blog post. You can download the release from here.
Sunday, January 18, 2009
Matrix SIG archive (1995-2000) back online
Unfortunately, the Matrix SIG archive (1995-2000) had been missing since at least 2005, when Robert Kern noticed that it had disappeared. Thanks to Skip Montanaro, Brad Knowles, and Barry Warsaw, the archive has been restored.
Monday, January 12, 2009
SciPy 0.7 coming soon . . .
After 16 months of hard work, the next stable release of scipy is almost ready to be tagged. This is the most significant scipy release in several years. It contains many new features, numerous bug-fixes, improved test coverage, and better documentation.
The scipy developer community started porting scipy to numpy at the end of 2005. Most of the work during 2006-2007 was focused on porting and bug-fixes and culminated in scipy 0.6. However, since the 0.6 release, the development effort has shifted from porting and maintenance to a much greater focus on infrastructure, architecture, and functionality.
One of the most important developments during the last year has been the extensive work on the testing and documentation framework. The improvements to testing and documentation first appeared in the numpy 1.2 release.
Our new testing infrastructure is based on the nose testing framework, which makes writing tests much easier than before. The simplified testing framework, together with an increasing recognition of the importance of unit testing among the scipy developers, has led to a doubling of the number of tests since the last stable release.
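To give a rough idea of why this style of testing is easier to write, here is a minimal sketch of the kind of test module that nose can collect. The function `hypotenuse` and the test names are made up for illustration and are not taken from the scipy test suite; the only convention assumed is that nose picks up plain functions whose names start with `test`.

```python
import math


def hypotenuse(a, b):
    """Hypothetical function under test (not part of scipy)."""
    return math.sqrt(a * a + b * b)


# Plain functions named test_* are enough: no TestCase subclass,
# no suite registration, just a bare assert per expectation.
def test_hypotenuse_known_triangle():
    assert hypotenuse(3.0, 4.0) == 5.0


def test_hypotenuse_is_symmetric():
    assert hypotenuse(2.0, 7.0) == hypotenuse(7.0, 2.0)
```

Because each test is just a function with an assert, adding one more check costs a couple of lines, which is a large part of what lowers the barrier to writing tests.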
This release also brings immense documentation improvements. You can now view the scipy reference manual online or download it as a PDF file. The new reference guide was built using the popular Sphinx tool. We have also updated the scipy tutorial, which hadn't been touched for several years. Both the reference manual and the tutorial are easily editable using our web-based documentation editor. If you want to improve the documentation, please register a user name in our web-based editor and correct any issues you find.
In addition to huge improvements in the testing and documentation infrastructure, this release cleans up a number of annoyances and removes old cruft. There have been a number of deprecations and well-documented API changes in this release. This is also the first stable release since the sandbox was removed.
We also did a fairly extensive review of the code to make sure that all of it is correctly licensed. About a year ago, I noticed that there was code not licensed under the revised BSD license. I was trying to "correct" the license information for the scipy Fedora package, but was told that the Fedora packaging policy required including all licenses found in a package. After doing some grepping and looking at svn commit logs, it was easy to figure out what code needed to be relicensed and who had committed it. Fortunately, I was able to track everyone down, and everyone kindly agreed to relicense their code. There was also some code included in scipy derived from "Numerical Recipes in C" code. Unfortunately, the "Numerical Recipes in C" code doesn't permit redistribution. The scipy developers quickly reimplemented the offending code from scratch. With the 0.7 release, scipy only includes code licensed under the revised BSD license.
The 0.7 release should be out by the end of the month, and we have already started working on the next feature release. During the development of the 0.7 release, there has been a rapid increase in community involvement and numerous infrastructure improvements to lower the barrier to contributions (e.g., more explicit coding standards, improved testing infrastructure, better documentation tools). I look forward to seeing this trend continue and invite everyone to become more involved. To learn more about the changes in the 0.7 release, please see the release notes.
Monday, July 7, 2008
SciPy 2008 Conference Program posted
Since 2002, the conference has been driven almost entirely by Enthought (Austin, TX) with on-site co-ordination and assistance by the Center for Advanced Computing Research (Caltech, Pasadena, CA). This year the community took a much larger role in conference planning. I am co-chairing the conference with Travis Vaught of Enthought. Gaël Varoquaux and Stéfan van der Walt have invested a huge amount of time in developing a TurboGears conference website. We have also created a much larger program committee and tutorials committee with members from Europe, Africa, and North America. You can see the entire list of organizers here. During past conferences Enthought has sponsored a small number of students. This year we are also very excited that the Python Software Foundation (PSF) has agreed to help Enthought fund more student sponsorships for this year's conference, bringing the number of students to ten for the first time.
The tutorial committee has decided to offer two tutorial tracks this year, rather than one. The first is a two-day, in-depth introductory course on scientific computing with Python. The advanced track consists of eight two-hour sessions covering a variety of topics from building extensions to graphical user interfaces.
The program committee has done an excellent job putting together a very interesting schedule. We are very fortunate to have Alex Martelli for our keynote address this year. Although the conference will last the same number of days, there will be a larger number of shorter talks this year (16 talks last year to 23 talks this year). This will be the first year that we'll be publishing a proceedings book for selected talks. Travis Vaught and I will be giving the first annual "State of SciPy" talk. We will also have an expert panel discussion at the end of the conference.
Finally, we are extending the post-conference code sprint from one to two days. Last year the coding sprint was very successful, but there was a feeling that one day was too short. We are hoping to get a large number of participants this year. We already have commitments from several core NumPy, SciPy, IPython, SymPy, Mayavi, Numscons, and ETS developers.
Early registration ends on Friday, July 11, 2008.
Saturday, December 29, 2007
NumPy/SciPy blog aggregator
Sunday, December 23, 2007
The end of the SciPy sandbox
The sandbox is currently creating more problems than it solves:
- Sandbox code limits group development, since it is often viewed as a place where a specific developer (or maybe a small group of developers) is experimenting. In fact, several of the packages are simply named after the developer. Branching would be a more appropriate way to handle experimental work done by a small group of developers.
- The ambiguous nature of the sandbox (i.e., in the SciPy trunk, but not in the release) plus a greater tolerance for broken code allows loose coding and documentation standards, which creates a barrier to inclusion in the core.
- Having packages included in the trunk implies that the code will eventually move into official releases; but several of the packages (e.g., old graphics packages) will not be included in future releases.
- Finally and most importantly, the sandbox leads to confusion and installation headaches. Users expect to have access to sandbox packages when they install SciPy binaries. But if they want to use a sandbox package, they are encouraged to download the source code, edit configuration files, and build SciPy themselves.
Eventually, we would like to see all of the following code/packages/functionality moved into scipy: arpack, buildgrid, constants, delaunay, ga, image, lowbpcg, montecarlo, netcdf, newoptimize, rbf, rkern, and spline. Most of this code is unlikely to be ready by the 0.7.0 release, so it will probably just be moved into a branch for now.
We would like to see all of the following code/packages/functionality moved into a scikit: ann, exmplpackage, fdfpack, multigrid, pyem, pyloess, svm, and timeseries. My next blog entry will be focused on Scikits.
The packages belonging to specific developers should probably be moved into a branch: cdavid, duard, oliphant, and rkern. Some of the developers are also suggesting that numexpr be moved into a separate, stand-alone package. Finally, several packages can just be deleted: arraysetops, cow, gplt, maskedarray, plt, stats, wavelets, and xplt.
Monday, December 17, 2007
NumPy/SciPy strategic planning
Over the next few days I will be blogging about some of the things we discussed and started implementing.