Measurements that are consistent with our subjective experience are important, but man, is that a tough topic to tackle. There are a number of problems with the idea that make it as much an art as it is a science. I think groups like Harman have really helped make it a better science. I do a similar kind of work, measuring human perception in an objective way.
Let's take something concrete for a minute and make it a little fuzzy to show the point. Depression is a disorder; it's a kind of mental disease. It isn't caused by bacteria or a virus, nor can we do a blood test to measure its existence. Yet clearly some people are depressed and some people are not. It's a thing, and it's not good when people have it. How do we measure it? Well, I can create a measurement tool to examine a person's behavior and develop a scoring rubric that assesses whether someone is depressed. I can create a questionnaire as well: people fill it out and I set cut-off scores that reflect different levels of depression. On the face of it, these seem like objective scientific tools, but who says those scores reflect depression? How can I assert that? There are ways to do that, of course: we can validate the tool against a known accurate measure. For example, multiple trained mental health professionals might use accepted diagnostic criteria to classify someone as depressed or not, and we can compare that classification to my tool's result. If the accuracy of my tool, measured by its sensitivity and specificity, is sufficiently high, I can assert that it is an objective measure. Acoustic measurements are a step beyond that; they are more like a blood test. But the problem is, who says those measurements reflect what I hear? Who says that flat is better? We can only know that by developing criteria for interpreting the measurements and validating them against a known accurate tool: what trained listeners tell us they prefer. I think the problem is that we have a lot of bad objective tools. THD, and harmonic distortion in general, is a bad objective tool because it measures something that doesn't correlate well with what we hear. There are better objective tools, but they simply aren't used. Our overall room measurement technique may also be a bad objective tool.
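To make the validation step concrete, here is a minimal sketch of how sensitivity and specificity could be computed when a tool's result is compared against a reference classification. The questionnaire scores, the clinician labels, and the cut-off of 15 are all made up for illustration; they don't come from any real instrument.

```python
def sensitivity_specificity(reference, predicted):
    """Return (sensitivity, specificity) for paired True/False labels.

    reference: the accepted classification (e.g. clinician consensus)
    predicted: what the tool under validation says
    """
    if len(reference) != len(predicted):
        raise ValueError("reference and predicted must have the same length")

    tp = sum(1 for r, p in zip(reference, predicted) if r and p)
    tn = sum(1 for r, p in zip(reference, predicted) if not r and not p)
    fp = sum(1 for r, p in zip(reference, predicted) if not r and p)
    fn = sum(1 for r, p in zip(reference, predicted) if r and not p)

    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative reference case")

    sensitivity = tp / (tp + fn)  # share of true cases the tool catches
    specificity = tn / (tn + fp)  # share of non-cases the tool correctly clears
    return sensitivity, specificity


# Hypothetical data: clinician classification and questionnaire scores.
clinician = [True, True, True, True, False, False, False, False, False, False]
scores = [18, 22, 9, 25, 5, 12, 3, 7, 16, 4]
CUTOFF = 15  # assumed cut-off score, for illustration only

tool = [s >= CUTOFF for s in scores]
sens, spec = sensitivity_specificity(clinician, tool)
print(f"sensitivity={sens:.2f} specificity={spec:.2f}")
```

The same idea carries over to audio: swap the clinician labels for what trained listeners prefer and the questionnaire for a measurement-based prediction.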
Binaural measurement and 3D acoustic measurement are more accurate representations and bring in additional important information. I also have concerns about some of the validations. My main work right now is looking at the non-average in a study. Studies, including those that validated what "good sound" is, rely by design on an average. What did the average person prefer? We don't care what the unusual person preferred. The assumption is that the average really reflects what the majority of people like. That may not be a valid assumption at all, and even if it is, it may leave out a lot of non-average people who like something different for a valid reason. In my own work we are developing different trial designs that allow us to untangle why our average doesn't apply to everyone and to find patterns of association between subgroups that can be directly explained by a particular mechanism. It may be that there are in fact different preferences that are not actually personal (in the sense that everyone is unique) but specific to measurable attributes of a person, their speaker type, or their room.
Measurements are objective, we all know this. But there is a problem: not everyone is equally good at taking them or interpreting them. So while the measurement is objective, it may not be right or valid. Now, even if we can assume that the measurement is right, it IS NOT an objective representation of what we hear. Our ears a) are not in the same place as a single microphone position, and b) do not hear the way a microphone does. We have to approximate through measurements what our ears are actually hearing and then use our own understanding of hearing to interpret what the objective data is telling us. That is really the hard part. Averaging over an area around our head is thought to do that, but I would just say it's still an imperfect approximation. I used to use the exact method you use for in-room measurements, which I took from Stereophile. I stopped doing it a while back and began using a more methodical approach, partly because handling the mic alone can cause problems with the measurement. It's best to avoid any kind of extra handling and potential shadow in the response. Instead, as I mentioned, I set up a kind of grid around my head, including dead center. Then I just took lots of measurements over and over. This lets me not only average all of them, which is the same as spatial averaging with an RTA, but also examine the individual ones. I like having all that information saved. Of course, none of this gets at interpretation, and it's the interpretation that gets at what we hear. That's really tricky; I'm not even sure the research is so strong there yet. We know what kind of in-room measurements people perceive as better: smooth, with a rise of about 1 dB per octave from 20 kHz down to 20 Hz, meaning the bass at 20 Hz should be about 10 dB louder than the treble at 20 kHz. That's just one piece of information though.
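As a rough sketch of the two numeric ideas above, the code below averages a set of grid measurements per frequency bin and works out what a 1 dB per octave tilt amounts to between 20 Hz and 20 kHz. It assumes the measurements are magnitude responses in dB on a shared frequency axis, and that the average is taken on a power basis; that averaging choice is my assumption here, not necessarily what any particular RTA does. The function names and the sample numbers are made up for illustration.

```python
import math


def power_average_db(measurements_db):
    """Average several magnitude responses (in dB) bin by bin.

    Each measurement is a list of dB values on the same frequency axis.
    Values are converted to power, averaged, and converted back to dB.
    """
    n_bins = len(measurements_db[0])
    if any(len(m) != n_bins for m in measurements_db):
        raise ValueError("all measurements must share the same frequency bins")

    averaged = []
    for bin_values in zip(*measurements_db):
        mean_power = sum(10 ** (v / 10) for v in bin_values) / len(bin_values)
        averaged.append(10 * math.log10(mean_power))
    return averaged


def target_level_db(freq_hz, slope_db_per_octave=1.0, ref_hz=20000.0):
    """Target level relative to ref_hz for a response that rises toward the bass."""
    return slope_db_per_octave * math.log2(ref_hz / freq_hz)


# Hypothetical grid: three mic positions, three frequency bins each (dB).
grid = [
    [70.0, 68.0, 65.0],
    [72.0, 66.0, 65.0],
    [71.0, 67.0, 65.0],
]
average = power_average_db(grid)

# 20 Hz to 20 kHz spans log2(1000), just under 10 octaves,
# so a 1 dB per octave tilt comes to roughly 10 dB.
tilt_at_20hz = target_level_db(20.0)
print([round(v, 2) for v in average], round(tilt_at_20hz, 2))
```

Keeping the individual grid measurements around, rather than only the average, is what makes it possible to go back and look at how much any one position deviates.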
I made the argument in my article on distortion that linear distortion is the biggest concern and non-linear distortion is a minor concern. That is true based on the assumption of a competently designed speaker system. What if it isn't? What if it's someone's DIY affair and they didn't know what they were doing? What if it's a car and the system uses all active crossovers and the person didn't know what they were doing? You could make a system that measures with a perfect frequency response and have it still sound terrible. In fact, I recently listened to a speaker that, while doing a lot of things right, sounded bad to me and another reviewer. I didn't have a chance to fully explore it, but I suspect there is a distortion of some kind that I didn't measure. Its frequency response was very flat and its harmonic distortion was unusually low. Those weren't the problem. Yet the speaker had a kind of harshness to the treble in the 3-5 kHz range.