Alias behaviour in AFL


#1

Hi all,

Any known weirdness in the AFL alias behaviour?
In my mind, the following is correct, but I get an error.
(But I also see why this is… tricky…)

[In the following query, ‘muon’ & ‘electron’ are 2D arrays. ‘event’ is a dimension in each of them. I do count to aggregate and make each of them into a 1D array, and then join them.]

$ iquery -aq "
filter(
  join(
    count(
      filter(muon, mu_ptcone20 < 0.1 * mu_pt and mu_pt > 20000.0 and abs(mu_eta) < 2.4 and mu_isCombinedMuon <> 0 and mu_tight <> 0 and (mu_expectBLayerHit = 0 or mu_nBLHits <> 0) and (mu_nPixHits + mu_nPixelDeadSensors) > 1 and (mu_nSCTHits + mu_nSCTDeadSensors) >= 6 and (mu_nPixHoles + mu_nSCTHoles) < 3),
      event) as nmuons,
    count(
      filter(electron, el_ptcone20 < 0.1 * el_pt and el_pt > 20000.0 and abs(el_eta) < 2.5 and (el_author = 1 or el_author = 3) and el_medium <> 0),
    event) as nelectrons),
  (nmuons = 0 and nelectrons = 2) or (nmuons = 2 and nelectrons = 0))"

… but it doesn’t like it …

UserQueryException in file: src/query/parser/ALTranslator.cpp function: checkLogicalExpression line: 2912 Error id: scidb::SCIDB_SE_SYNTAX::SCIDB_LE_UNKNOWN_ATTRIBUTE_OR_DIMENSION Error description: Query syntax error. Attribute or dimension reference 'nmuons' does not exists. filter( join( count( filter(muon, mu_ptcone20 < 0.1 * mu_pt and mu_pt > 20000.0 and abs(mu_eta) < 2.4 and mu_isCombinedMuon <> 0 and mu_tight <> 0 and (mu_expectBLayerHit = 0 or mu_nBLHits <> 0) and (mu_nPixHits + mu_nPixelDeadSensors) > 1 and (mu_nSCTHits + mu_nSCTDeadSensors) >= 6 and (mu_nPixHoles + mu_nSCTHoles) < 3), event) as nmuons, count( filter(electron, el_ptcone20 < 0.1 * el_pt and el_pt > 20000.0 and abs(el_eta) < 2.5 and (el_author = 1 or el_author = 3) and el_medium <> 0), event) as nelectrons), (nmuons = 0 and nelectrons = 2) or (nmuons = 2 and nelectrons = 0)) ^^^^^^ Failed query id: 1101086280854

For reference here are the original schemas:

$ iquery -aq "show(muon)"
[("muon<mu_ptcone20:float,mu_pt:float,mu_eta:float,mu_isCombinedMuon:int32,mu_isLowPtReconstructedMuon:int32,mu_tight:int32,mu_expectBLayerHit:int32,mu_nBLHits:int32,mu_nPixHits:int32,mu_nPixelDeadSensors:int32,mu_nPixHoles:int32,mu_nSCTHits:int32,mu_nSCTDeadSensors:int32,mu_nSCTHoles:int32> [event(int32)=*,100000,0,muon(int32)=*,100000,0]")]

… and …

$ iquery -aq "show(electron)"
[("electron<el_ptcone20:float,el_pt:float,el_eta:float,el_author:int32,el_medium:int32> [event(int32)=*,100000,0,electron(int32)=*,100000,0]")]

#2

Hi.

There’s some confusion between alias for an array and an alias for an attribute.

When you do this:

count(
      filter(muon, mu_ptcone20 < 0.1 * mu_pt and mu_pt > 20000.0 and abs(mu_eta) < 2.4 and mu_isCombinedMuon <> 0 and mu_tight <> 0 and (mu_expectBLayerHit = 0 or mu_nBLHits <> 0) and (mu_nPixHits + mu_nPixelDeadSensors) > 1 and (mu_nSCTHits + mu_nSCTDeadSensors) >= 6 and (mu_nPixHoles + mu_nSCTHoles) < 3),
      event) as nmuons,

The “nmuons” becomes an alias for the entire array cell. That is fine and useful but the expression "nmuons = … " won’t work because “nmuons” is an array, not an attribute or dimension.

I think what you might want is

filter(
 join(
  aggregate(..., count(*) as nmuons) as A,
  aggregate(..., count(*) as nelectrons) as B
 ),
 (A.nmuons = 0 and B.nelectrons=2) or ...

Make sense? Helpful?


#3

But I also wonder if you are trying to perform an aggregate grouped by event, in which case you would say:

...
 join(
  aggregate(..., count(*) as nmuons, event) as A,
  aggregate(..., count(*) as nelectrons, event) as B
 ),
...

#4

Ok, I begin to see the problem.

The issue with

is that now counts of 0 become NULL, while in my initial attempt, I got the more convenient value of 0.

That in turns has a side effect when joining this array with others, because NULLs on either side will result in NULL for the output cell. In my initial attempt, the join will have the convenient value of 0 even after the join, which makes the filter and remaining query much easier to handle.

Does that make sense?

UPDATE: Had forgotten to close this one: what I missed is that the count operator returns an array with the field name ‘count’ hardcoded. So all I had to do was reference A.count and B.count and all worked fine.