# P-value meaning + rows highlighting

Hello Micha,

I would like to clarify the meaning of the P-value (P-Wert), because some parts are still confusing to me.

As you wrote in the handbook, the P-value is the logarithm of the probability that an individual observation belongs to the base population. If this logarithm is lower than the logarithm of the significance level alpha, the null hypothesis should be rejected (= the observation does not belong to the base population/set of observations). Is this correct?

If that is correct, then it is not clear to me how row highlighting according to the P-value works. The statement above splits the interval of possible results into only two parts: the part above the logarithm of the significance level, where the null hypothesis holds, and the part below it, where the null hypothesis should be rejected. But the row highlighting settings contain two predefined values for the P-value, which means the interval of possible values is split into three parts instead of two.

Moreover, the row highlighting settings refer to P as a probability in percent. If the default JAG3D test statistic settings use a probability of 99.9 % (significance level 0.1 %), it makes no sense to me why the default values in the row highlighting settings are set to 1.0 and 5.0 instead of, let's say, 95 and 99.9.

I would also like the highlighting behavior clarified. The P-value highlighting does not match the exact value of the logarithm of the significance level, which for 0.1 % (= 0.001) is -6.91; some rows are highlighted regardless of the condition log(p) > log(alpha). I guess this behavior is caused by the B-method adjusting the significance level alpha for each observation individually, so that the exact value of log(alpha) for individual observations deviates from the base value -6.91. Am I right?

In short, I am asking for help with how to set the row highlighting according to the P-value properly, possibly with a brief explanation of how all the values mentioned are connected.

Many thanks for your response and have a nice day!
Vlastimil

## P-value meaning + rows highlighting

Hello Vlastimil,

> As you wrote in the handbook, the P-value is the logarithm of the probability that an individual observation belongs to the base population. If this logarithm is lower than the logarithm of the significance level alpha, the null hypothesis should be rejected (= the observation does not belong to the base population/set of observations). Is this correct?

Yes, that sounds right to me. You can strip the logarithm from the interpretation. If $p < \alpha$, the null hypothesis is rejected.
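Because the logarithm is strictly increasing, comparing log(p) with log(alpha) gives the same decision as comparing p with alpha directly. The following is only a minimal sketch to illustrate that equivalence; the function names and the sample values are made up for illustration and are not taken from JAG3D.

```python
import math

alpha = 0.001                # significance level 0.1 %, as discussed above
log_alpha = math.log(alpha)  # natural logarithm, approx. -6.91


def reject_by_p(p, alpha):
    """Reject the null hypothesis if p < alpha."""
    return p < alpha


def reject_by_log_p(log_p, alpha):
    """Same decision, expressed on the logarithmic scale."""
    return log_p < math.log(alpha)


# Illustrative log(p) values (made up, not real adjustment results)
for log_p in (-10.70, -7.50, -5.00, -0.50):
    p = math.exp(log_p)
    # Both formulations agree because log() is strictly increasing
    assert reject_by_p(p, alpha) == reject_by_log_p(log_p, alpha)
    print(f"log(p) = {log_p:7.2f}  p = {p:.6f}  reject: {reject_by_p(p, alpha)}")
```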

> If that is correct, then it is not clear to me how row highlighting according to the P-value works. The statement above splits the interval of possible results into only two parts: the part above the logarithm of the significance level, where the null hypothesis holds, and the part below it, where the null hypothesis should be rejected. But the row highlighting settings contain two predefined values for the P-value, which means the interval of possible values is split into three parts instead of two.

Yes, that is right. The boundary values are taken from the Cadastre (in Germany), where 1 % and 5 % are used. For this reason, I defined these values as the defaults, but you are free to choose your own values.

> I would also like the highlighting behavior clarified. The P-value highlighting does not match the exact value of the logarithm of the significance level, which for 0.1 % (= 0.001) is -6.91; some rows are highlighted regardless of the condition log(p) > log(alpha).

This sounds like a bug, and I will have to reproduce this behavior myself by creating an example. The pie charts in the HTML report seem to be correct. However, I am currently on a measurement campaign and out of the office, so it will take a few days until I can check my source code.

I will come back to this thread as soon as possible.

/Micha

--
applied-geodesy.org - OpenSource Least-Squares Adjustment Software for Geodetic Sciences

## P-value meaning + rows highlighting

> Hello Vlastimil,

> > As you wrote in the handbook, the P-value is the logarithm of the probability that an individual observation belongs to the base population. If this logarithm is lower than the logarithm of the significance level alpha, the null hypothesis should be rejected (= the observation does not belong to the base population/set of observations). Is this correct?

> Yes, that sounds right to me. You can strip the logarithm from the interpretation. If $p < \alpha$, the null hypothesis is rejected.

Thanks for the link, I have something to study. Based on the link: does JAG3D calculate the P-value for testing the Tprio/Tpost values?

> > If that is correct, then it is not clear to me how row highlighting according to the P-value works. The statement above splits the interval of possible results into only two parts: the part above the logarithm of the significance level, where the null hypothesis holds, and the part below it, where the null hypothesis should be rejected. But the row highlighting settings contain two predefined values for the P-value, which means the interval of possible values is split into three parts instead of two.

> Yes, that is right. The boundary values are taken from the Cadastre (in Germany), where 1 % and 5 % are used. For this reason, I defined these values as the defaults, but you are free to choose your own values.

Then the highlighting settings are mostly about the significance level alpha rather than the probability value in %. Isn't the label misleading? Shouldn't the highlight settings say "significance level alpha in %" (or "Probability value alpha in %", as written in the test statistic settings) instead?

> > I would also like the highlighting behavior clarified. The P-value highlighting does not match the exact value of the logarithm of the significance level, which for 0.1 % (= 0.001) is -6.91; some rows are highlighted regardless of the condition log(p) > log(alpha).

> This sounds like a bug, and I will have to reproduce this behavior myself by creating an example. The pie charts in the HTML report seem to be correct. However, I am currently on a measurement campaign and out of the office, so it will take a few days until I can check my source code.

To illustrate this behavior, I am attaching two screenshots I took. In the General tab, the logarithms of alpha for the observations are calculated as -6.91. Row highlighting is set to 0.1 and 1 %. The row with station 45A should then also be red, if the highlighting follows similar logic to the T-value highlighting (the T-value highlighting marks a row where either Tprio or Tpost exceeds the critical value).

If I understand correctly, and the highlight settings for the P-value should be set according to the significance level alpha, then the boundaries should be:

Boundaries in %: 0 <= 0.1 <= 1 <= 100
Boundaries in absolute: 0 <= 0.001 <= 0.01 <= 1
Boundaries in logarithm: -inf <= -6.91 <= -4.61 <= 0

In my case, ln(Pprio) = -10.70 for station 45A lies in the first interval and should be marked in red instead of yellow.

Similar behavior can be seen at station 1326 in the bottom part of the picture: that row should be marked red as well, since both values are below ln(0.001).
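The conversion between the three representations of the boundaries can be checked with a few lines of Python. This is only a rough sketch of the reasoning above, not JAG3D code: the function names are hypothetical, and how a value lying exactly on a boundary is treated is an assumption.

```python
import math


def log_boundaries(percent_boundaries):
    """Convert boundaries given in percent to natural logarithms."""
    return [math.log(b / 100.0) for b in percent_boundaries]


def interval_index(log_p, percent_boundaries):
    """Return 0, 1 or 2 for the interval that log(p) falls into.

    0 = below the lower boundary, 1 = between both, 2 = above the upper one.
    A value exactly on a boundary counts towards the upper interval
    (an assumption made for this sketch).
    """
    return sum(log_p >= b for b in log_boundaries(percent_boundaries))


bounds_percent = [0.1, 1.0]  # highlighting boundaries used in the screenshots
print([round(b, 2) for b in log_boundaries(bounds_percent)])  # [-6.91, -4.61]
print(interval_index(-10.70, bounds_percent))                 # 0 (first interval)
```

With these boundaries, ln(Pprio) = -10.70 lands in the first interval, which matches the expectation that the row should get the color of the lowest class.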

I am currently using the version marked as:
UI v20230316
DB v20230131
AC v20230228

Thanks!
Vlastimil

> I will come back to this thread as soon as possible.
>
> /Micha

## P-value meaning + rows highlighting - JAG3Dv20230904

Hello Vlastimil,

> Based on the link: does JAG3D calculate the P-value for testing the Tprio/Tpost values?

JAG3D obtains the p-values from the test statistics Tprio and Tpost, respectively. There is a simple relation between the two representations: T is the test statistic and the p-value is the corresponding probability. The two values are linked to each other. If you know T, you also know p, and vice versa.
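To illustrate the one-to-one relation between T and p, here is a small sketch. It assumes, purely for illustration, that T follows a chi-square distribution with one degree of freedom; the distributions JAG3D actually uses for Tprio and Tpost are not stated in this thread, and the function names are hypothetical.

```python
import math


def p_from_t(t):
    """Upper-tail probability P(X > t) for X ~ chi-square with 1 degree of freedom."""
    return math.erfc(math.sqrt(t / 2.0))


def t_from_p(p, lo=0.0, hi=1000.0):
    """Invert p_from_t by bisection (p_from_t decreases monotonically in t)."""
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if p_from_t(mid) > p:
            lo = mid  # p still too large, so T must be larger
        else:
            hi = mid
    return 0.5 * (lo + hi)


t = 10.83                 # illustrative test statistic
p = p_from_t(t)           # corresponding probability
print(p, math.log(p))     # p and its natural logarithm
print(t_from_p(p))        # recovers T from p
```

Since the mapping is monotonic, checking T against a critical value and checking p against alpha lead to the same decision under the assumed distribution.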

> Then the highlighting settings are mostly about the significance level alpha rather than the probability value in %. Isn't the label misleading? Shouldn't the highlight settings say "significance level alpha in %" (or "Probability value alpha in %", as written in the test statistic settings) instead?

The p-value is a probability value related to the type I error. However, if row highlighting is enabled, the colors of the table rows are not related to the alpha you specified in the test statistic dialog. To be more concrete: the p-values come from your data, whereas alpha is your sensibly selected (individual) threshold. For instance, there is no alpha for a specific distance, but there is a test statistic T and a related probability p of T, and you can check p against your specified alpha. The boundaries in the row highlighting dialog are similar to the bins of a histogram.
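The histogram analogy can be sketched as follows. The p-values are made up, the function name is hypothetical, and the treatment of values lying exactly on a boundary is an assumption; the point is only that the bins are defined by the highlighting boundaries and do not involve alpha.

```python
from bisect import bisect_right
from collections import Counter

boundaries_percent = [1.0, 5.0]                # default boundaries mentioned above
p_values = [0.00002, 0.004, 0.03, 0.20, 0.75]  # made-up p-values, not real results


def bin_index(p, boundaries_percent):
    """Return the index of the bin (0, 1 or 2) that p, given as a fraction, falls into."""
    return bisect_right(boundaries_percent, 100.0 * p)


# Count how many p-values fall into each bin, like a three-bin histogram
counts = Counter(bin_index(p, boundaries_percent) for p in p_values)
for index in range(len(boundaries_percent) + 1):
    print(f"bin {index}: {counts[index]} value(s)")
```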

> To illustrate this behavior, I am attaching two screenshots I took.

Thank you. I checked my source code and (I think) I was able to figure out the problem. I have compiled a corrected version of JAG3D, which can be found here: JAG3Dv20230904.

Can you check this release and send me some feedback?

All the best
Micha


## P-value meaning + rows highlighting - JAG3Dv20230904

Hello Micha,

thanks for clarifying the meaning of the P-value. I asked because I am trying to find the best workflow for network adjustment (I know about your recommendation to use T-values in combination with EP) that suits my quite specific data set. I hope I now understand the issue, although it will still need some investigation and thinking.

Regarding the bug you found: I am attaching a screenshot of a similar network with settings similar to before, and everything looks as it should.

I am glad I was able to "contribute", and thanks for the bugfix :)
Vlastimil

## P-value meaning + rows highlighting - JAG3Dv20230904

Hello Vlastimil,

> thanks for clarifying the meaning of the P-value. I asked because I am trying to find the best workflow for network adjustment

You are welcome!

> I am glad I was able to "contribute", and thanks for the bugfix :)

Thank you for reporting and discussing problems/issues.

All the best
Micha
