Preference Elicitation on Motor Trends Dataset
This vignette demonstrates how to use the prefeR package on a real dataset. The mtcars dataset provides us with such an opportunity.
|Car|mpg|cyl|disp|hp|drat|wt|qsec|vs|am|gear|carb|
|---|---|---|---|---|---|---|---|---|---|---|---|
|Mazda RX4 Wag|21.0|6|160|110|3.90|2.875|17.02|0|1|4|4|
|Hornet 4 Drive|21.4|6|258|110|3.08|3.215|19.44|1|0|3|1|
If we wanted to give a user a list of their top five most preferred cars from the mtcars dataset, there are three approaches we could take:
1. Have the user manually rank all options.
2. Have the user provide weights for the desirability of different car features, and calculate the weighted value of each option.
3. Have the user compare a small number of alternatives, and derive their weights from those comparisons.
Option #1 quickly becomes an enormous burden on the user as the number of alternatives increases. Option #2 is difficult for the user to do and replicate. What exactly does it mean if the weight assigned to horsepower is double the weight assigned to fuel efficiency?
Option #3 is enabled by the preference elicitation package. To begin, we create a preference elicitation object and give it our data:
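A minimal sketch of this step, assuming prefeR is installed from CRAN and that its constructor is prefEl(), which takes the alternatives through a data argument (one row per car, one column per attribute):

```r
# Assumption: the prefeR package is installed, e.g. install.packages("prefeR").
library(prefeR)

# Create the preference elicitation object and hand it the alternatives.
# Row names (the car names) identify each alternative in later steps.
p <- prefEl(data = mtcars)

# Printing the object summarizes the data and any preferences added so far.
p
```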
Now we can add in our Bayesian priors for the weights. Although it is difficult to determine weights exactly, one usually has a ballpark estimate for what they should be, and often one knows the sign of a weight with certainty: all else equal, everyone would prefer a more fuel-efficient car. The prefeR package contains three built-in priors:
- Normal(mu, sigma) provides a one-dimensional Normal prior with mean mu and standard deviation sigma. This prior is useful if you have a good guess for the weight and a sense of how far the true value might lie from that guess.
- Exp(mu) provides a one-dimensional Exponential prior with mean mu (not rate!). This prior is particularly useful if you know the sign of the weight for certain and have a guess for its value. The mean may be negative.
- Flat() yields a completely agnostic, flat prior.
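To make the "mean, not rate" parameterization concrete, here is a base-R sketch of an exponential log-density written in terms of its mean. The helper name exp_mean_logpdf is hypothetical and this is not prefeR's internal code. Treating a negative mean as the same distribution mirrored onto the negative half-line is an assumption about what a negative mean implies.

```r
# Hypothetical helper, not part of prefeR: log-density of an exponential
# distribution parameterized by its mean `mu` rather than by its rate.
# Assumption: a negative mean mirrors the distribution onto x <= 0.
exp_mean_logpdf <- function(x, mu) {
  stopifnot(mu != 0)
  same_side <- x * sign(mu) >= 0   # is x on the side of zero allowed by mu?
  out <- rep(-Inf, length(x))      # values with the wrong sign get zero density
  out[same_side] <- dexp(abs(x[same_side]), rate = 1 / abs(mu), log = TRUE)
  out
}

# A mean of 2 corresponds to a rate of 1/2.
exp_mean_logpdf(1, mu = 2)         # same as dexp(1, rate = 0.5, log = TRUE)

# With mu = -3 only non-positive weights have non-zero density.
exp_mean_logpdf(c(-1, 1), mu = -3)

# Numerical check that the mirrored density has mean close to -3.
integrate(function(x) x * exp(exp_mean_logpdf(x, mu = -3)), -Inf, 0)$value
```

Under this reading, a prior such as Exp(-3) encodes "the weight is negative, and probably around -3 in magnitude".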
We can now add priors for each of the mtcars attributes, in column order:

    p$priors <- c(Exp(1),    # MPG
                  Normal(),  # number of cylinders (Normal() = Normal(0, 1))
                  Normal(),  # displacement
                  Exp(2),    # horsepower
                  Normal(),  # rear axle ratio
                  Normal(),  # weight
                  Exp(-3),   # quarter mile time
                  Normal(),  # engine type
                  Normal(),  # transmission type
                  Normal(),  # number of gears
                  Normal()   # number of carburetors
    )
Now, we can add in our user’s preferences:
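A sketch of how the three preferences printed below might be entered, assuming the object's addPref() method and the comparison operators %>% (strictly preferred) and %=% (indifferent) that prefeR exports. Treat the exact calls as an assumption and check them against the package documentation. Note also that prefeR's %>% would mask the magrittr pipe if both packages are attached.

```r
library(prefeR)                  # assumption: prefeR is installed
p <- prefEl(data = mtcars)       # same kind of object as created earlier

# Alternatives are referred to by their row names in the data.
p$addPref("Pontiac Firebird" %>% "Fiat 128")   # strict preference
p$addPref("Mazda RX4" %>% "Mazda RX4 Wag")     # strict preference
p$addPref("Merc 280" %=% "Merc 280C")          # indifference
```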
    p
    ## Preference elicitation object with:
    ##  32 observations of 11 variables.
    ## And the following preferences:
    ##  Pontiac Firebird preferred to Fiat 128
    ##  Mazda RX4 preferred to Mazda RX4 Wag
    ##  Merc 280 indifferent to Merc 280C
Now, we can infer what our attribute weights should be:
    p$infer()
    ##          mpg          cyl         disp           hp         drat           wt
    ##  0.000000000  0.005892875  0.473346403  0.000000000 -0.001473219 -0.286931232
    ##         qsec           vs           am         gear         carb
    ## -0.284793197 -0.001473219 -0.001473219 -0.001473219  0.001473219
And we can get our top five cars:
    p$rank()[1:5]
    ##  Cadillac Fleetwood Lincoln Continental   Chrysler Imperial    Pontiac Firebird
    ##            216.8368            211.1522            201.8215            183.4207
    ##          Duster 360
    ##            164.9131
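These scores are consistent with a plain weighted sum: multiplying each car's attribute values by the inferred weights and adding them up reproduces the same top five. The base-R sketch below checks this without prefeR. The weight vector is copied by hand from the p$infer() output, and reading rank() as a weighted sum is inferred from these numbers rather than taken from the package source.

```r
# Weights copied from the p$infer() output shown in this vignette,
# in the same order as the columns of mtcars.
w <- c(mpg = 0, cyl = 0.005892875, disp = 0.473346403, hp = 0,
       drat = -0.001473219, wt = -0.286931232, qsec = -0.284793197,
       vs = -0.001473219, am = -0.001473219, gear = -0.001473219,
       carb = 0.001473219)

# Score each car as the weighted sum of its attributes.
X <- as.matrix(mtcars[, names(w)])
scores <- drop(X %*% w)          # named numeric vector, one entry per car

# Sort from most to least preferred and keep the top five.
top5 <- sort(scores, decreasing = TRUE)[1:5]
round(top5, 4)
```

Because displacement carries by far the largest weight here, the ranking is dominated by engine size, which is why the five largest-displacement cars come out on top.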
Finally, we can ask which pair of cars the user should compare next:
    p$suggest()
    ## [1] "Pontiac Firebird" "Cadillac Fleetwood"