Apologies in advance for this long intro, but I feel I need to give some context ...

<intro>

Suppose you've got a table that contains, among others, two fields. Let's call them **A** and **B**. Let A1, A2 and A3 be three values taken by **A**, and let B1, B2 and B3 be three values taken by **B**.

Let's say we are analysing a subset of the 9 possible combinations of the values taken by the two variables and we find the following number of occurrences:

A2 & B1 => 21 occurrences

A2 & B2 => 22 occurrences

A2 & B3 => 17 occurrences

</intro>

The following is what I've come up with, but I am not satisfied with my choice of verbs.

*The map also puts in evidence that there is no significant correlation between the two variables, as neither does any combination clearly* __prevail over/dominate__ the others, nor is any combination __outweighed/dominated__ by the others.

Any suggestions?

It's part of an academic article, so I'd like it to be fairly formal ... but natural!

Thank you very much!

Comments (Page 2)

I didn't want to bore anybody with the details of the study. Now I feel I must defend my academic reputation, though ... so here they are.

First I (I'm the "GIS woman") built a large spatial dataset -- a GIS-based one, I mean. The original one contained 377 rows * 30-odd columns; that was later reduced to 167 rows * 7 columns. Each row corresponds to a polygon in a map (city boundaries, in the real world) and each column to an attribute.

The correlations were found by my colleague (he's the statistician) by means of a logistic regression model.

As a final step, I tried to put the outputs of the regression model back on the map, which means both looking for those polygons/rows for which the outcomes of the regression model hold true, and analysing their geographic distribution.

Not really exciting, huh? I expect people who have read all of this post to be sleeping by now ...

Tanit

The reason I wrote both is that, as far as my understanding goes, that might not be true if you have more than just three combinations.

Let's take these, for instance (C1 ... C5 = combinations)

C1 = 100; C2 = 20; C3 = 23; C4 = 19; C5 = 2

C1 predominates/prevails over the others

C5 is dominated(?)/outweighed(?) by the others

C1 = 80; C2 = 82; C3 = 79; C4 = 77; C5 = 5

no single combination clearly predominates/prevails over the others

C5 is dominated(?)/outweighed(?) by the others

C1 = 100; C2 = 2; C3 = 3; C4 = 5; C5 = 2

C1 predominates/prevails over the others

no single combination is dominated(?)/outweighed(?) by the others

Would you agree or is there something about those words (prevail over, predominate over, dominate, outweigh) I am missing?

Thank you!

Tanit

all values cluster around the same frequency, a situation with no outliers in either direction. I confess I was a little confused, but I believe I have it straightened out in my mind now.

In your second case, C2 = 82 is much greater than C5 = 5. This sounds like C2 predominates over C5, and yet you say this case has no single combination that clearly predominates/prevails over the others. So I suppose you mean that no single combination clearly predominates/prevails over all the others. I think I might add "all".

I think I may have originally mistaken your statement to mean that no two individual combinations could be chosen from among all combinations which were related as "very great" to "very small" (in the way that C2 and C5 are related). I realize now that that is not what you meant.

A bit off-topic: I imagine that you are just summarizing the mathematics at this point in your report, because I don't think you can actually express these relationships in words with the same precision that a mathematical formula could do. You can only give a general impression of what "much greater" or "much less" might mean compared to "a little greater" or "a little less". Within the set 80, 82, 79, 77, 5, for example, the general reader can pretty much guess how much wider the gap between 82 and 5 is compared to the gap between 80 and 79, but in the real-world cases, it seems to me that the statistical machinery is what sets the criteria between a "tight cluster" of data and a "loose cluster" of data, and thus sets what constitutes an outlier for your particular study.

Other than that, yes, I would agree.

CJ

CalifJim

I will add 'all', as you suggest.

The mathematics is summarized in the article by means of a table that also shows the significance of the correlations. I'm just trying to describe the map that sort of looks backwards at the outcomes of the statistics to see whether & where the model fits.

Luckily, I'm almost done [<:o)] ... so I won't annoy you all any longer with this stuff.

Tanit

What struck me as unusual about Tanit's description was that she was looking for proof of correlation at both extremities of the bell curve, as it were. The important thing was for a single pair of values to distance itself from the cluster either by having the most "occurrences" or by having the fewest. Perhaps I misread, but I was convinced of that.

She said that there was no significant correlation between the two variables because (1) no single value pair predominates, and (2) no single value pair is predominated over by all the others.

It seemed to me that zero occurrences would have been ideal. But how can this be described as predominance? If there were an antonym for "predominates" I would have cheerfully used it.

Either I misread the whole thing, or I'm losing my mind.

Avangi

one variable. Establishing correlation requires separate statistical techniques which deal with the relationships between two or more variables. These techniques may or may not make use of the mathematics relating to the curve of the normal distribution.

I never did particularly well in statistics classes, so I'm probably not the one you should be listening to in any case. I really don't understand why the clustering of those "combinations" into groups that all have about the same value implies "no significant correlation". You would probably have to study all the relevant formulas that Tanit linked us to to get a handle on it. I, for one, am not going to go that far. If you decide to pursue it further, good luck!

CJ

CalifJim

I was just using the bell curve figuratively to explain getting away from the cluster in either direction.

I'm still in the dark about how that leads to correlation.

If you have one "individual" all alone at the extreme right hand side of the bell curve, he clearly predominates in the sense we've described here. In this case, that would represent a single value pair having the highest number of "occurrences."

But Tanit seemed to be equally interested in the left hand extreme.

I'll listen to you any time on any subject.

Avangi