Model lognormally distributed data with a gamma distribution:
Compare the distribution of the simulated data with the estimated distribution:
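A minimal sketch of this workflow, assuming simulated input. The lognormal parameters 0.5 and 0.4, the sample size, and the names data and edist are illustrative choices, not values from the original example:

```wolfram
(* Simulated sample; parameters 0.5 and 0.4 are illustrative assumptions *)
SeedRandom[1];
data = RandomVariate[LogNormalDistribution[0.5, 0.4], 1000];

(* Fit a gamma model with free shape a and scale b *)
edist = EstimatedDistribution[data, GammaDistribution[a, b]]

(* Overlay the fitted density on a histogram scaled as a PDF *)
Show[
 Histogram[data, Automatic, "PDF"],
 Plot[PDF[edist, x], {x, 0, 5}, PlotStyle -> Thick]
]
```

The fit will not be exact, because the gamma family does not contain the lognormal distribution; the plot shows where the two shapes differ.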
The number of accident claims per policy per year from an insurance company:
Model the data by a logarithmic series distribution since most policies have at most one claim:
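A sketch of the fitting step, assuming simulated claim counts in place of the insurance data, which are not reproduced here. The parameter 0.4 and the names claims, edist, and observed are hypothetical:

```wolfram
(* Simulated stand-in for the claim counts; 0.4 is an illustrative assumption *)
SeedRandom[2];
claims = RandomVariate[LogSeriesDistribution[0.4], 2000];

(* Estimate the single parameter theta of the logarithmic series model *)
edist = EstimatedDistribution[claims, LogSeriesDistribution[theta]]

(* Observed relative frequencies next to the fitted probabilities for 1 to 5 claims *)
observed = N[Counts[claims]/Length[claims]];
TableForm[
 Table[{k, Lookup[observed, k, 0.], N[PDF[edist, k]]}, {k, 1, 5}],
 TableHeadings -> {None, {"claims", "observed", "fitted"}}
]
```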
Get word length data for several languages:
Model the word lengths for each language as binomially distributed:
Compare the actual and estimated distributions:
The word count in a text follows a Zipf distribution:
Fit a ZipfDistribution to the word frequency data:
Compare the frequency histogram with the estimated distribution:
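A sketch of the fit and comparison, assuming simulated rank data rather than counts from an actual text. The parameter 1.2, the cutoff at rank 20, and the names ranks and edist are illustrative:

```wolfram
(* Simulated ranks standing in for word frequency data; 1.2 is an assumption *)
SeedRandom[3];
ranks = RandomVariate[ZipfDistribution[1.2], 2000];

(* Estimate the Zipf parameter rho *)
edist = EstimatedDistribution[ranks, ZipfDistribution[rho]]

(* Unit-width bins centered on the integers 1..20; the tail beyond rank 20 is not shown *)
Show[
 Histogram[ranks, {0.5, 20.5, 1}, "PDF"],
 DiscretePlot[PDF[edist, k], {k, 1, 20}, PlotStyle -> Red]
]
```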
EstimatedDistribution can be used with constructs like MixtureDistribution to create multimodal models:
The magnitudes of earthquakes in the United States in the years 1935-1989 have two modes:
Fit a distribution from the possible mixtures of one NormalDistribution with another:
Compare the histogram to the PDF of the estimated distribution:
Find the probability of an earthquake of magnitude 7 or higher:
Find the mean earthquake magnitude:
Simulate magnitudes of the next 30 earthquakes:
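The mixture steps above can be sketched as follows, assuming simulated bimodal magnitudes in place of the earthquake records. The weights, means, and standard deviations used to generate mags are illustrative assumptions, as are all symbol names:

```wolfram
(* Simulated two-mode sample; all generating parameters are assumptions *)
SeedRandom[4];
mags = RandomVariate[
   MixtureDistribution[{0.6, 0.4},
    {NormalDistribution[6.0, 0.3], NormalDistribution[7.0, 0.4]}], 500];

(* Fit a two-component normal mixture with free weights and parameters *)
edist = EstimatedDistribution[mags,
   MixtureDistribution[{w1, w2},
    {NormalDistribution[m1, s1], NormalDistribution[m2, s2]}]]

(* Probability of magnitude 7 or higher under the fitted model *)
Probability[x >= 7, x \[Distributed] edist]

(* Mean magnitude and 30 simulated future magnitudes *)
Mean[edist]
RandomVariate[edist, 30]
```

If the fit has trouble converging, starting values can be supplied as a third argument of the form {{w1, 0.5}, {m1, 6}, ...}.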
Model monthly maximum wind speeds in Boston:
Compare the empirical quantiles with those of the fitted distributions to see where the models deviate from the data:
Model incomes at a large state university:
Assume the salaries are Dagum distributed:
Assume they follow a more general Pareto distribution:
Compare the subtle differences in the estimated distributions:
Use a beta distribution to model the proportion of Dow Jones Industrial stocks that increase in value on a given day:
Find daily change for Dow Jones Industrial stocks:
Filter out missing data and pad with zeros:
Calculate the daily ratio of companies with an increase in value:
Find parameter estimates, excluding days on which none or all of the companies increased in value:
Compare quantiles to see that the data and estimated distribution match well:
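A sketch of the estimation and quantile comparison, assuming simulated daily ratios instead of the stock data. The beta parameters 5 and 4, the sample size, and the names ratios and edist are hypothetical:

```wolfram
(* Simulated daily ratios of advancing stocks; parameters 5 and 4 are assumptions *)
SeedRandom[5];
ratios = RandomVariate[BetaDistribution[5, 4], 750];

(* Keep only ratios strictly between 0 and 1, mirroring the exclusion step *)
ratios = Select[ratios, 0 < # < 1 &];

(* Estimate both shape parameters of the beta model *)
edist = EstimatedDistribution[ratios, BetaDistribution[a, b]]

(* Points close to the reference line indicate a good match *)
QuantilePlot[ratios, edist]
```

The exclusion matters because the beta density can be zero or unbounded at the endpoints 0 and 1, which breaks the likelihood.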
The average city and highway mileage for midsize cars follows a binormal distribution:
Assume city and highway miles per gallon are normally distributed and correlated:
Show the distribution of city and highway mileage:
Visualize the joint density with contours on a logarithmic scale:
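A sketch of the bivariate fit, assuming simulated mileage pairs in place of the car data. The means, standard deviations, correlation 0.8, plot ranges, and the names mpg and edist are illustrative:

```wolfram
(* Simulated {city, highway} pairs; all generating parameters are assumptions *)
SeedRandom[6];
mpg = RandomVariate[BinormalDistribution[{22, 30}, {3, 4}, 0.8], 300];

(* Estimate the means, standard deviations, and correlation *)
edist = EstimatedDistribution[mpg,
   BinormalDistribution[{m1, m2}, {s1, s2}, r]]

(* Contours of the log density with the data points overlaid *)
Show[
 ContourPlot[Log[PDF[edist, {x, y}]], {x, 12, 32}, {y, 18, 42},
  Contours -> 10],
 ListPlot[mpg, PlotStyle -> Black]
]
```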
The data contains waiting times in days between serious (magnitude at least 7.5 or over 1000 fatalities) earthquakes worldwide, recorded from 12/16/1902 to 3/4/1977:
Model waiting times by an ExponentialDistribution:
Estimate the average and median number of days between major earthquakes:
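A sketch of these two steps, assuming simulated waiting times rather than the recorded ones. The rate 1/400 per day, the sample size 60, and the names waits and edist are hypothetical:

```wolfram
(* Simulated waiting times in days; the rate 1/400 is an illustrative assumption *)
SeedRandom[7];
waits = RandomVariate[ExponentialDistribution[1/400], 60];

(* Estimate the rate parameter lambda *)
edist = EstimatedDistribution[waits, ExponentialDistribution[lambda]]

(* Mean is 1/lambda and median is Log[2]/lambda for an exponential model *)
{Mean[edist], Median[edist]}
```

The median is smaller than the mean by a factor of Log[2], reflecting the right skew of the exponential distribution.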
The number of earthquakes per year can be modeled by SinghMaddalaDistribution:
Fit the distribution to the data:
Compare the data histogram with the PDF of the estimated distribution:
Find the probability of at least 60 earthquakes in the U.S. in a year:
Mixtures can be used to model multimodal data:
A histogram of waiting times for eruptions of the Old Faithful geyser exhibits two modes:
Fit a MixtureDistribution to the data:
Compare the histogram to the PDF of the estimated distribution:
Find the probability that the waiting time is over 80 minutes:
Simulate waiting times for the next 60 eruptions:
A lognormal distribution can be used to model stock prices:
Fit the distribution to the data:
Observe that the quantiles for the data and distribution match well except for the largest values:
Consider the annual minimum daily flows given in cubic meters per second for the Mahanadi river:
Model the annual minimum mean daily flows as a MinStableDistribution:
Compare the histogram of the data to the PDF of the estimated distribution:
Simulate annual minimum mean daily flows for the next 30 years:
Use a Pareto distribution to model Australian city population sizes:
Estimate the probability that a city has a population of at least 10,000 people:
Compute the probability based on the original data:
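A sketch of the model-based and data-based probabilities, assuming simulated population sizes in place of the Australian city data. The minimum 2000, the shape 1.1, and the names pops and edist are illustrative assumptions:

```wolfram
(* Simulated city populations; minimum 2000 and shape 1.1 are assumptions *)
SeedRandom[8];
pops = RandomVariate[ParetoDistribution[2000, 1.1], 400];

(* Estimate the minimum value k and the shape alpha *)
edist = EstimatedDistribution[pops, ParetoDistribution[k, alpha]]

(* Probability of at least 10,000 people under the fitted model *)
Probability[x >= 10000, x \[Distributed] edist]

(* The same probability as a raw proportion of the data *)
N[Count[pops, p_ /; p >= 10000]/Length[pops]]
```

Close agreement between the two numbers suggests the Pareto tail describes the data well at this threshold.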