Distribution fitting and histogram overlay (scaling matter) - MATLAB

owner

While trying to overlay pdf (probability density function) values on a histogram, I run into significant scaling issues: the histogram is barely visible on my chart. This is probably due to the scaling factors used in the code below, but without them the density curves would be tiny. Is there a better way to do this without resorting to ad hoc scaling factors?

bmin=min(b);
bmax=max(b);
nrow=length(b);
nbin=sqrt(nrow);
pd = fitdist(b,'Normal');
p = stblfit(b,'ecf');
x_pdf=[bmin:0.0025:bmax];
y=pdf(pd,x_pdf);
hist(b,nrow);
h = findobj(gca, 'Type','patch');
h.FaceColor=[0 0 0];
hold on;
scale = 0.156*max(y);
plot(x_pdf,y.*scale,'or');
hold on;
scale2 = 0.24*max(y);
plot(x_pdf,stblpdf(x_pdf,p(1),p(2),p(3),p(4)).*scale2,'k-');
legend('P&L distribution','Normal fit', 'ecf fit')

Thanks

hbaderts

When you plot a histogram, the y-axis shows the number of samples that fall into each bin, so the bar heights change with the length of the input vector and with the bin width. A histogram can serve as an approximation of the probability density function (PDF), but only after it is scaled correctly: the integral of a PDF from -infinity to +infinity must equal 1, so the histogram has to be scaled until its area is 1.
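To make the scaling concrete: for N samples and bins of equal width w, the area under a count histogram is N*w. The same factor can therefore be used the other way round, to lift a density curve onto the count axis instead of the arbitrary 0.156 and 0.24 factors. The following is only a minimal sketch under assumptions that are not from the question: the data are synthetic, the bin count of 50 is arbitrary, and the normal density is computed by hand from the sample mean and standard deviation so that only base MATLAB is needed.

```matlab
rng(0);                              % fixed seed so the sketch is repeatable
b = 2 + 3 .* randn(10000, 1);        % illustrative data, not the asker's P&L vector
N = numel(b);

[counts, centers] = hist(b, 50);     % counts per bin and bin centers (50 is arbitrary)
binWidth = centers(2) - centers(1);  % bins are equally spaced for a scalar bin count

% Normal density evaluated with the sample mean and standard deviation
mu = mean(b);
sigma = std(b);
x = linspace(min(b), max(b), 1000);
y = exp(-(x - mu).^2 ./ (2 * sigma^2)) ./ (sigma * sqrt(2 * pi));

% The count histogram has area N * binWidth, so scale the density by that factor
yCounts = y .* (N * binWidth);

figure;
bar(centers, counts, 1);             % relative width 1 makes adjacent bars touch
hold on;
plot(x, yCounts, '-r');
hold off;
```

The factor N * binWidth depends only on the data and the binning, so the same multiplier should apply to any other fitted density drawn on the same axes.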

You can still use the hist command, but instead of letting it draw the plot, request its outputs to get the counts and the bin centers. The counts vector can then be scaled to have an integral of 1 by computing its integral and dividing the counts by it.

% Generate some arbitrary gaussian distribution
b = randi(10) + randi(10) .* randn(10000,1);
bmin = min(b);
bmax = max(b);

% Calculate histogram
[counts,bins] = hist(b,100);

% Scale histogram to get the pdf
est_pdf = counts / sum(counts * mean(diff(bins)));

% Estimate pdf using fitdist
pd = fitdist(b,'Normal');
x_pdf = linspace(bmin,bmax,1000);
y_pdf = pdf(pd,x_pdf);

% Plot everything
figure;
hold on;
bar(bins,est_pdf);
plot(x_pdf, y_pdf, '-r');
hold off;

Note: I approximate the integral of counts by multiplying the counts by the mean bin width, mean(diff(bins)). For a scalar number of bins, hist uses equally spaced bins, so the spacing of the returned centers varies only by floating-point rounding and the approximation is accurate enough here.
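If the histogram function is available (this assumes MATLAB R2014b or later), the normalization can be requested directly instead of being computed by hand. The sketch below is illustrative only: the data are synthetic, the bin count of 50 is arbitrary, and the normal density is again written out from the sample mean and standard deviation rather than taken from fitdist.

```matlab
rng(0);                              % fixed seed so the sketch is repeatable
b = 2 + 3 .* randn(10000, 1);        % illustrative data, not the asker's P&L vector

figure;
% With 'Normalization','pdf' each bar height is count / (N * binWidth)
h = histogram(b, 50, 'Normalization', 'pdf');
hold on;

% Normal density evaluated with the sample mean and standard deviation
mu = mean(b);
sigma = std(b);
x = linspace(min(b), max(b), 1000);
y = exp(-(x - mu).^2 ./ (2 * sigma^2)) ./ (sigma * sqrt(2 * pi));
plot(x, y, '-r');                    % no extra scaling factor needed
hold off;

% Sanity check: the normalized bars should enclose an area of 1
histArea = sum(h.Values) * h.BinWidth;
```

Because both the bars and the curve are then on a density scale, any further fitted density can be plotted on the same axes without a separate scaling factor.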
