## Histograms – Remember the GD library?

Published: December 23rd, 2008 by:

Perhaps better known to most people as "bar graphs," histograms give statisticians a graphical picture of how a set of data is distributed.  The graph can be used to make inferences about the data and to draw conclusions about the sample and the population it came from.  In this article, we'll dive into a few more functions from the GD library that are particularly useful for drawing one.

As always, we will take a step-by-step approach to the script. I should say up front that my "finished" code is not the best; what matters is how the main functions discussed here are used.  Let's start by setting up the image and creating the axes.  The exact math is not too important and can be done in many ways, and the numbers will change with the image size and spread you want.  Just remember that the upper-left corner of the image is (0,0) and the bottom-right corner is (width-1,height-1), with y increasing downward, so adjust your coordinates accordingly.  The following code analyzes the data set: it determines the individual classes (each of which will be drawn as one bar), the frequency of each class, the number of pixels that represent a frequency of 1 (Yscl), and the width of the image based on the number of classes.  Then it draws the x- and y-axes.

```
<?php
// Data arrives as one string with values separated by "a", e.g. ?x=1a2a3&nC=2
$x = explode("a", $_GET['x']);
$numClasses = (int) $_GET['nC'];

// Find the smallest and largest values in the data set
$min = (float) $x[0];
$max = (float) $x[0];
foreach ($x as $val) {
    if ((float) $val < $min) { $min = (float) $val; }
    if ((float) $val > $max) { $max = (float) $val; }
}

// Width of each class (assumes integer data; the +1 keeps $max in the last class)
$c = ceil(($max - $min + 1) / $numClasses);

// Each class is array(lower bound, upper bound, frequency)
$class = array();
for ($z = 0; $z < $numClasses; $z++) {
    $class[$z] = array(($z * $c) + $min, (($z + 1) * $c) + ($min - 1), 0);
    foreach ($x as $val) {
        if (($val >= $class[$z][0]) && ($val <= $class[$z][1])) {
            $class[$z][2]++;
        }
    }
}

// Highest frequency, used to scale the bars
$maxCount = 0;
for ($z = 0; $z < $numClasses; $z++) {
    if ($class[$z][2] > $maxCount) {
        $maxCount = $class[$z][2];
    }
}
$Yscl = 500 / $maxCount; // pixels per unit of frequency
$width = 70 + ($numClasses * 70);

$histogram = imagecreate($width, 560);

//Defining our colors here...you remember how to do that, right?

$white = imagecolorallocate($histogram, 255, 255, 255); // first color becomes the background
$black = imagecolorallocate($histogram, 0, 0, 0);
$blue = imagecolorallocate($histogram, 0, 0, 255);
imageline($histogram, 50, 10, 50, 510, $black);     // y-axis
imageline($histogram, 50, 510, $width, 510, $black); // x-axis

// ...

?>
```
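To make the class arithmetic easier to follow, here is a minimal sketch of the same idea as a standalone function that needs no GD at all. The function name `buildClasses` and the sample data are my own inventions for illustration. The width is computed as ceil((max - min + 1) / numClasses) so that the largest value always falls in the last class, which assumes integer data.

```php
<?php
// Hypothetical helper (not part of the script above): groups integer data
// into $numClasses equal-width classes of the form [lower, upper, count].
function buildClasses(array $values, int $numClasses): array
{
    $values = array_map('floatval', $values);
    $min = min($values);
    $max = max($values);

    // Class width; the +1 keeps the maximum inside the last class
    // (assumption: the data are integers).
    $c = (int) ceil(($max - $min + 1) / $numClasses);

    $classes = [];
    for ($z = 0; $z < $numClasses; $z++) {
        $lower = $min + $z * $c;
        $upper = $lower + $c - 1;
        $count = 0;
        foreach ($values as $val) {
            if ($val >= $lower && $val <= $upper) {
                $count++;
            }
        }
        $classes[] = [$lower, $upper, $count];
    }
    return $classes;
}

// Sample data in the same "a"-separated form the script reads from the URL
$classes = buildClasses(explode("a", "1a2a2a3a5a8a9a10"), 3);
foreach ($classes as [$lower, $upper, $count]) {
    echo "$lower-$upper: $count\n";
}
```

For this sample the three classes come out as 1-4, 5-8, and 9-12, holding 4, 2, and 2 values respectively, so every value is counted exactly once.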

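Next, we'll handle a quick, rather minor part: labeling the histogram with the imagestring() function.  GD offers several other text-drawing functions, including imagefttext(), imagepstext(), and imagettftext(), so pick the one that's best for you (note that imagepstext() was removed in PHP 7).  As a companion to imagestring(), we can also use imageloadfont() to load a bitmap font of our own.  In the call below, \$font is either a built-in font number from 1 to 5 or the value returned by imageloadfont(), and \$x and \$y give the upper-left corner of the text.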

```
<?php

imagestring($image, $font, $x, $y, "Text", $color);

?>

```
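As a rough illustration of placing a label, the sketch below centers a string horizontally using one of the built-in fonts. The helper `centeredX`, the image size, and the output file name are assumptions of mine; `imagefontwidth()` returns the pixel width of one character in the given font, which is all the helper needs. The drawing part only runs when the GD extension is loaded.

```php
<?php
// Hypothetical helper: x-coordinate that centers $text horizontally,
// given the pixel width of one character of a fixed-width font.
function centeredX(int $imageWidth, string $text, int $charWidth): int
{
    return intdiv($imageWidth - strlen($text) * $charWidth, 2);
}

if (extension_loaded('gd')) {
    $image = imagecreate(200, 40);
    $white = imagecolorallocate($image, 255, 255, 255); // first color = background
    $blue  = imagecolorallocate($image, 0, 0, 255);
    $font  = 2; // built-in fonts are numbered 1 to 5

    $x = centeredX(imagesx($image), "Classes", imagefontwidth($font));
    imagestring($image, $font, $x, 12, "Classes", $blue);

    // Write to a temporary file here instead of sending it to the browser
    imagepng($image, sys_get_temp_dir() . '/label.png');
}
```

Computing the position from the font width avoids hand-tuned offsets that stop working as soon as the label text or the font changes.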

The font file that imageloadfont() reads has to follow a specific binary format, so read up on it for the version of GD you have and what you want to accomplish.  We aren't going to add any text to the chart yet; I'll include that in the code that follows.  Finally, let's get to the most important part, creating the bars, using the imagefilledrectangle() function:

```
<?php

imagefilledrectangle($image, $x1, $y1, $x2, $y2, $color);

?>

```

The two points should be opposite corners of the rectangle.  The PHP manual describes them as the upper-left and lower-right corners, so passing them in that order is the safe choice.  As you should know by now, \$image refers to the image you created and \$color to a previously allocated color that the rectangle will be filled with.  Computing the size of each rectangle and adding the proper labels, here's the additional code:

```
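<?php

// $font is a built-in font number (1 to 5) or the result of imageloadfont()
imagestring($histogram, $font, (int) (($width / 2) - 25), 527, "Classes", $blue);
for ($z = 0; $z < $numClasses; $z++) {
    // Y-coordinate of the top of this bar, rounded to a whole pixel
    $top = (int) round(509 - ($Yscl * $class[$z][2]));
    // The bar itself: upper-left corner, then lower-right corner
    imagefilledrectangle($histogram, 70 + (69 * $z), $top, (69 * $z) + 120, 509, $blue);
    // Class range under the bar
    imagestring($histogram, $font, 67 + (69 * $z), 511, $class[$z][0] . "-" . $class[$z][1], $black);
    // Frequency label and tick mark on the y-axis
    imagestring($histogram, $font, 30, $top, (string) $class[$z][2], $blue);
    imageline($histogram, 47, $top, 53, $top, $black);
}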

```

As you can see, we used the imagestring() function to label the x-axis, label each class of the histogram, and mark the bar heights on the y-axis to make the chart easy to read.
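The bar coordinates are the fiddly part, so here is a small sketch that isolates them. The function `barRect` and the sample frequencies are hypothetical; the constants (baseline at y = 509, bars 50 pixels wide and spaced 69 pixels apart, starting at x = 70) are taken from the loop above.

```php
<?php
// Hypothetical helper: corner coordinates of bar number $z for a class
// with frequency $count, using the layout constants from the article.
function barRect(int $z, int $count, float $yScale): array
{
    $left   = 70 + 69 * $z;
    $right  = $left + 50;
    $bottom = 509;
    $top    = (int) round($bottom - $yScale * $count);
    return [$left, $top, $right, $bottom]; // upper-left, then lower-right
}

$counts = [4, 2, 2];             // sample frequencies, one per class
$yScale = 500 / max($counts);    // the tallest bar is 500 pixels high
foreach ($counts as $z => $count) {
    [$x1, $y1, $x2, $y2] = barRect($z, $count, $yScale);
    echo "bar $z: ($x1,$y1) to ($x2,$y2)\n";
    // With GD loaded, each bar would then be drawn with:
    // imagefilledrectangle($histogram, $x1, $y1, $x2, $y2, $blue);
}
```

Because y grows downward, a taller bar has a smaller top coordinate: the class with the highest frequency reaches y = 9, while the baseline stays at y = 509.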

All in all, the code I use reads as follows, and I think you will see that, except perhaps for adjusting the coordinates to fit the image, it is not too complicated.

```
<?php
$x = explode("a", $_GET['x']); //Array of x-values
$numClasses = (int) $_GET['nC']; //Desired number of classes
$font = 2; //Built-in GD font (1 to 5), or the result of imageloadfont()
$min = (float) $x[0];
$max = (float) $x[0];
foreach ($x as $val) {
    if ((float) $val < $min) { $min = (float) $val; }
    if ((float) $val > $max) { $max = (float) $val; }
}
//Width of each class (assumes integer data; the +1 keeps $max in the last class)
$c = ceil(($max - $min + 1) / $numClasses);
//Each class is array(lower bound, upper bound, frequency)
$class = array();
for ($z = 0; $z < $numClasses; $z++) {
    $class[$z] = array(($z * $c) + $min, (($z + 1) * $c) + ($min - 1), 0);
    foreach ($x as $val) {
        if (($val >= $class[$z][0]) && ($val <= $class[$z][1])) {
            $class[$z][2]++;
        }
    }
}
//Highest frequency, used to scale the bars
$maxCount = 0;
for ($z = 0; $z < $numClasses; $z++) {
    if ($class[$z][2] > $maxCount) {
        $maxCount = $class[$z][2];
    }
}
$Yscl = 500 / $maxCount; //Pixels per unit of frequency
$width = 70 + ($numClasses * 70);
$histogram = imagecreate($width, 560);
$white = imagecolorallocate($histogram, 255, 255, 255); //First color becomes the background
$black = imagecolorallocate($histogram, 0, 0, 0);
$blue = imagecolorallocate($histogram, 0, 0, 255);
imageline($histogram, 50, 10, 50, 510, $black); //y-axis
imageline($histogram, 50, 510, $width, 510, $black); //x-axis
imagestring($histogram, $font, (int) (($width / 2) - 25), 527, "Classes", $blue);
for ($z = 0; $z < $numClasses; $z++) {
    $top = (int) round(509 - ($Yscl * $class[$z][2])); //Top of this bar
    imagefilledrectangle($histogram, 70 + (69 * $z), $top, (69 * $z) + 120, 509, $blue);
    imagestring($histogram, $font, 67 + (69 * $z), 511, $class[$z][0] . "-" . $class[$z][1], $black);
    imagestring($histogram, $font, 30, $top, (string) $class[$z][2], $blue);
    imageline($histogram, 47, $top, 53, $top, $black);
}
header('Content-Type: image/png'); //Tell the browser an image is coming
imagepng($histogram);
imagedestroy($histogram);
?>

```

With this, we get a great-looking histogram, and the colors and text can easily be modified to achieve the look you want.  PHP and the GD library once again save the day, making it easier for statisticians to represent data graphically.

Check out my sample script, where you can enter a set of data and a number of classes (can be random) and be presented with a basic histogram.
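If you accept data typed in by visitors, it is worth checking the input before drawing anything. The sketch below is one possible way to do that, not the code behind the sample script: the function `parseHistogramInput` is hypothetical, and it assumes the same query format used in this article (values separated by "a" in `x`, the class count in `nC`).

```php
<?php
// Hypothetical helper: turns query parameters such as ?x=1a2a3&nC=2 into
// a list of numbers and a class count, rejecting anything unusable.
function parseHistogramInput(array $query): array
{
    $raw = explode("a", (string) ($query['x'] ?? ''));
    $values = [];
    foreach ($raw as $item) {
        if (is_numeric($item)) {   // silently skip empty or non-numeric pieces
            $values[] = (float) $item;
        }
    }
    $numClasses = (int) ($query['nC'] ?? 0);
    if ($values === [] || $numClasses < 1) {
        throw new InvalidArgumentException('Need at least one value and one class.');
    }
    return [$values, $numClasses];
}

// In the real script the argument would be $_GET
[$values, $numClasses] = parseHistogramInput(['x' => '4a8a15a16a23a42', 'nC' => '3']);
echo count($values), " values, ", $numClasses, " classes\n";
```

Rejecting an empty data set or a class count below 1 up front also avoids a division by zero when the class width and the y-scale are computed.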
