Saturday, January 30, 2010

Installing PostgreSQL on Mac OS X

PostgreSQL provides binaries for major operating systems, so installation is as simple as download, mount the dmg, and... read the README. Notably, it seems the amount of shared memory needs to be adjusted:

PostgreSQL One Click Installer README
=====================================

Shared Memory
-------------

PostgreSQL uses shared memory extensively for caching and inter-process
communication. Unfortunately, the default configuration of Mac OS X does
not allow suitable amounts of shared memory to be created to run the
database server.

Before running the installation, please ensure that your system is
configured to allow the use of larger amounts of shared memory. Note that
this does not 'reserve' any memory so it is safe to configure much higher
values than you might initially need. You can do this by editing the
file /etc/sysctl.conf - e.g.

% sudo vi /etc/sysctl.conf

On a MacBook Pro with 2GB of RAM, the author's sysctl.conf contains:

kern.sysv.shmmax=1610612736
kern.sysv.shmall=393216
kern.sysv.shmmin=1
kern.sysv.shmmni=32
kern.sysv.shmseg=8
kern.maxprocperuid=512
kern.maxproc=2048

Note that (kern.sysv.shmall * 4096) should be greater than or equal to
kern.sysv.shmmax. kern.sysv.shmmax must also be a multiple of 4096.

Having previously (more than once) tweaked settings I didn't understand and ended up in... unpleasant... configurations, I step carefully now.

Naively, some of these make sense. kern.maxproc and kern.maxprocperuid are self-evidently the maximum number of processes the kernel allows in total and per user account. My current settings are 532 and 266, respectively. Since I'm not going to be running a public webserver off of this machine, only a server for local testing, I doubt very much that I'll spawn hundreds of processes, so these are probably fine the way they are.

kern.sysv.shm* must refer to shared memory, and according to the README's note, shmmax is in bytes, while shmall is in 4096-byte blocks. The others are more obscure: a minimum limit, something about segments (likely), and something else.
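To check my reading of the units, here is a small Python sketch of the arithmetic. It is my own check, not something from the README, and it assumes the 4096-byte page size that the README's note implies.

```python
PAGE_SIZE = 4096  # bytes per page, assumed from the README's note

shmmax = 1610612736  # bytes, value from the README
shmall = 393216      # pages, value from the README

shmall_bytes = shmall * PAGE_SIZE

print(shmmax // (1024 * 1024))   # shmmax expressed in MiB
print(shmall_bytes >= shmmax)    # README's first condition
print(shmmax % PAGE_SIZE == 0)   # README's second condition
```

With the README's numbers, shmall covers exactly shmmax bytes, which is 1536 MiB, or three quarters of the 2GB in that MacBook Pro.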

Let's ask Dr. Google. The number one hit for 'kern.sysv.shmmni' is a 318 Tech Journal entry titled 'Shared Memory Settings Explained'.

Shared memory is a method of inter-process communication (IPC), where two processes communicate with each other through shared blocks of RAM. Because communication is resident in RAM, shared memory allows for very fast communication between processes. There are significant drawbacks to shared memory; one obvious limitation is that all communicating processes must exist on the same box. Additional complexities with the implementation of shared memory mean that it is typically relegated to lower-level, performance-oriented systems, such as databases or backup systems.

In OS X, these settings MUST be tweaked if you are expecting to back up significant amounts of data with any semblance of speed or stability. I can confirm that both TiNa and NetVault use shared memory for IPC. Other products such as Retrospect or PresStore utilize other IPC methods, such as named pipes.

kern.sysv.shmall
shmall represents the maximum number of pages able to be provisioned for shared memory. It determines the total amount of shared memory that the system can allocate. To determine total system shared memory, multiply this value by the page size. The page size can be determined via `vm_stat` or `getconf PAGE_SIZE`. A typical page size is 4KB, 4096 bytes.
In OS X, Apple uses extremely conservative settings for shmall. At 1024, OS X defaults to only 4MB of shared memory.

kern.sysv.shmseg
shmseg represents the maximum number of shared memory segments each process can attach. Default in OS X is 8.

kern.sysv.shmmni
shmmni limits the number of shared memory segments across the system, representing the total number of shared memory segments. Default in OS X is 32.

kern.sysv.shmmin
shmmin is the minimum size of a shared memory segment; this should pretty much never need modification. Default is 1.

kern.sysv.shmmax
shmmax is the maximum size of a segment. Default in OS X is 4 MB, 4194304.

Suggested Settings:

512MB of shared memory
kern.sysv.shmall: 131072
kern.sysv.shmseg: 32
kern.sysv.shmmni: 128
kern.sysv.shmmin: 1
kern.sysv.shmmax: 536870912

1GB Shared memory
kern.sysv.shmall: 262144
kern.sysv.shmseg: 32
kern.sysv.shmmni: 128
kern.sysv.shmmin: 1
kern.sysv.shmmax: 1073741824

The defaults listed there are exactly what I see when I run:

$ sysctl -a | sort | grep kern.sysv
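For comparing those values programmatically, a minimal sketch in Python follows. The `parse_sysctl` helper is hypothetical (my own name), and the sample text is typed in by hand from the defaults quoted above, assuming `sysctl` prints one `name: value` pair per line.

```python
def parse_sysctl(text):
    """Collect kern.sysv.shm* settings from sysctl-style text as integers."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line.startswith("kern.sysv.shm"):
            continue
        # sysctl prints 'name: value'; /etc/sysctl.conf uses 'name=value'.
        name, _, value = line.replace("=", ":", 1).partition(":")
        settings[name.strip()] = int(value.strip())
    return settings


# Hand-typed sample matching the defaults quoted in the article.
sample = """\
kern.sysv.shmall: 1024
kern.sysv.shmmax: 4194304
kern.sysv.shmmin: 1
kern.sysv.shmmni: 32
kern.sysv.shmseg: 8
"""

defaults = parse_sysctl(sample)
for name in sorted(defaults):
    print(name, defaults[name])
```

The same helper accepts the `name=value` form, so it can also be pointed at the contents of a sysctl.conf.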

I have a 2 GB system, so extrapolating the simple doubling of some values from the 512 MB to 1 GB cases to my 2 GB case, I would have:

kern.sysv.shmall: 524288
kern.sysv.shmseg: 32
kern.sysv.shmmni: 128
kern.sysv.shmmin: 1
kern.sysv.shmmax: 2147483648
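The doubling can be written down as a rule: shmmax is the target size in bytes and shmall is that size divided by the page size. Here is a sketch in Python, assuming 4096-byte pages; `shm_settings` is a hypothetical helper of mine, not something the article provides.

```python
PAGE_SIZE = 4096  # bytes per page, assumed


def shm_settings(total_bytes, page_size=PAGE_SIZE):
    """Return shmall (pages) and shmmax (bytes) for a target amount of shared memory."""
    if total_bytes % page_size != 0:
        raise ValueError("target size must be a multiple of the page size")
    return {
        "kern.sysv.shmall": total_bytes // page_size,
        "kern.sysv.shmmax": total_bytes,
    }


MIB = 1024 * 1024
for size in (512 * MIB, 1024 * MIB, 2048 * MIB):
    settings = shm_settings(size)
    print(size // MIB, "MiB:", settings["kern.sysv.shmall"], settings["kern.sysv.shmmax"])
```

For 512 MiB and 1 GiB this reproduces the article's suggested shmall and shmmax, and for 2 GiB it gives the values extrapolated above.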

That's actually more than is recommended in the PostgreSQL README, so I'll use the README's values, on the principle that the least sufficient value is the safest.

In the README is a link into the PostgreSQL documentation about the shared memory settings. The exposition of their purpose isn't as straightforward as 318 Tech Journal's, but they do have this to add about later versions of OS X:

In OS X 10.3.9 and later ... you can create a file named /etc/sysctl.conf, containing variable assignments such as:

kern.sysv.shmmax=4194304
kern.sysv.shmmin=1
kern.sysv.shmmni=32
kern.sysv.shmseg=8
kern.sysv.shmall=1024

.... Note that all five shared-memory parameters must be set in /etc/sysctl.conf, else the values will be ignored.

Beware that recent releases of OS X ignore attempts to set SHMMAX to a value that isn't an exact multiple of 4096.

SHMALL is measured in 4 kB pages on this platform.
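Since a missing parameter means the whole set is silently ignored, it seems worth checking the file before rebooting. Below is a rough Python sketch; `check_conf` is a hypothetical helper, the 4096-byte page size is assumed, and it only checks the two rules quoted above (all five parameters present, shmmax a multiple of 4096).

```python
REQUIRED = (
    "kern.sysv.shmmax",
    "kern.sysv.shmmin",
    "kern.sysv.shmmni",
    "kern.sysv.shmseg",
    "kern.sysv.shmall",
)


def check_conf(text, page_size=4096):
    """Return a list of problems found in sysctl.conf-style text."""
    values = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        if "=" not in line:
            continue
        name, value = line.split("=", 1)
        values[name.strip()] = value.strip()

    problems = ["missing " + name for name in REQUIRED if name not in values]
    shmmax = values.get("kern.sysv.shmmax")
    if shmmax is not None and int(shmmax) % page_size != 0:
        problems.append("kern.sysv.shmmax is not a multiple of %d" % page_size)
    return problems


# The values from the PostgreSQL README, as they would appear in the file.
readme_conf = """\
kern.sysv.shmmax=1610612736
kern.sysv.shmall=393216
kern.sysv.shmmin=1
kern.sysv.shmmni=32
kern.sysv.shmseg=8
kern.maxprocperuid=512
kern.maxproc=2048
"""

print(check_conf(readme_conf))
```

An empty list means neither rule is violated; to check the real file, pass in the contents of /etc/sysctl.conf instead of the sample string.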

That done, system rebooted, the installer ran without incident.

The next steps to getting Nuj up and running on my Mac are setting up the database, setting environment variables, and porting the helper scripts that I've written.

Installing Django 1.0 on OS X Leopard

I'm setting up my Mac as a development host for Nuj.

On Linux, my dev setup is Emacs, Git, Django, PostgreSQL, the source tree, and a few scripts. OS X comes with Emacs, and Git was a simple dmg drag-and-drop. Django is a bit more complicated.

I'll follow the Quick Install Guide provided by Django.

1. Install Python. OS X comes with Python.

$ python -c "from distutils.sysconfig import get_python_lib; print get_python_lib()"
/Library/Python/2.6/site-packages

Django works with Python versions 2.3 to 2.6, so this is fine.

2. Set up a database. Python 2.5 and later bundle SQLite support, and I have version 2.6, so I can skip this step for now. I'll detail setting up PostgreSQL in a sequel.

3. Remove any old versions of Django. None present.

4. Install Django. The instructions provide three choices: a distribution-supplied version, an official release, or the main development trunk. I don't believe that Apple supplies a version of Django; that option is mainly relevant for Linux systems. I'd rather not be on the bleeding edge, so that leaves an official release. Clicking through the second option's link drops me onto the installation page, which provides four steps:

1. Download the latest official release. Django-1.1.1.tar.gz. Check.

2. Unpack the tarball. That requires a single click on the tarball in the downloads stack.

3. Cd to the directory created by tar.

4. Install with the usual Python idiom:

$ sudo python setup.py install

That, apparently, is it.
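A quick sanity check of the steps above might look like the following sketch. The helper names are my own; the 2.3 to 2.6 range is the one cited above for this Django release, and the script sticks to syntax that runs on both old and new Python interpreters. On a machine without Django it prints None for the last line.

```python
import sys


def python_supported(version=None):
    """True if the (major, minor) version falls in the 2.3 to 2.6 range cited above."""
    major_minor = tuple((version or sys.version_info)[:2])
    return (2, 3) <= major_minor <= (2, 6)


def module_version(name, attr):
    """Return the named attribute of a module (called if callable), or None if not installed."""
    try:
        module = __import__(name)
    except ImportError:
        return None
    value = getattr(module, attr)
    return value() if callable(value) else value


print(python_supported())                           # step 1: Python version
print(module_version("sqlite3", "sqlite_version"))  # step 2: bundled SQLite
print(module_version("django", "get_version"))      # step 4: installed Django
```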