MongoDB $redact to filter out some elements of an array


I am trying to formulate a query on the sample bios collection http://docs.mongodb.org/manual/reference/bios-example-collection/:

Retrieve all persons and the awards they received before receiving the Turing Award.

I have come up with this query:

db.bios.aggregate([
    { $match: { "awards.award": "Turing Award" } },
    { $project: {
        "award1": "$awards",
        "award2": "$awards",
        "first_name": "$name.first",
        "last_name": "$name.last"
    }},
    { $unwind: "$award1" },
    { $match: { "award1.award": "Turing Award" } },
    { $unwind: "$award2" },
    { $redact: {
        $cond: {
            if: { $eq: [ { $gt: [ "$award1.year", "$award2.year" ] }, true ] },
            then: "$$KEEP",
            else: "$$PRUNE"
        }
    }}
])

And this is the answer:

/* 0 */
{
    "result" : [
        {
            "_id" : 1,
            "award1" : {
                "award" : "Turing Award",
                "year" : 1977,
                "by" : "ACM"
            },
            "award2" : {
                "award" : "W.W. McDowell Award",
                "year" : 1967,
                "by" : "IEEE Computer Society"
            },
            "first_name" : "John",
            "last_name" : "Backus"
        },
        {
            "_id" : 1,
            "award1" : {
                "award" : "Turing Award",
                "year" : 1977,
                "by" : "ACM"
            },
            "award2" : {
                "award" : "National Medal of Science",
                "year" : 1975,
                "by" : "National Science Foundation"
            },
            "first_name" : "John",
            "last_name" : "Backus"
        },
        {
            "_id" : 4,
            "award1" : {
                "award" : "Turing Award",
                "year" : 2001,
                "by" : "ACM"
            },
            "award2" : {
                "award" : "Rosing Prize",
                "year" : 1999,
                "by" : "Norwegian Data Association"
            },
            "first_name" : "Kristen",
            "last_name" : "Nygaard"
        },
        {
            "_id" : 5,
            "award1" : {
                "award" : "Turing Award",
                "year" : 2001,
                "by" : "ACM"
            },
            "award2" : {
                "award" : "Rosing Prize",
                "year" : 1999,
                "by" : "Norwegian Data Association"
            },
            "first_name" : "Ole-Johan",
            "last_name" : "Dahl"
        }
    ],
    "ok" : 1
}

What I don't like about this solution is that I unwind $award2. Instead, I would be happy to keep award2 as an array and only remove the awards that were received after award1. So, for instance, the answer for John Backus should be:

{
    "_id" : 1,
    "first_name" : "John",
    "last_name" : "Backus",
    "award1" : {
        "award" : "Turing Award",
        "year" : 1977,
        "by" : "ACM"
    },
    "award2" : [
        {
            "award" : "W.W. McDowell Award",
            "year" : 1967,
            "by" : "IEEE Computer Society"
        },
        {
            "award" : "National Medal of Science",
            "year" : 1975,
            "by" : "National Science Foundation"
        }
    ]
}

Is it possible to achieve this with $redact without doing $unwind: "$award2"?

It might have been a little more helpful if you had included the original state of the document as an example in your question, since this shows "where you are coming from" and "where you want to get to" as a goal, in addition to the desired output given.

That's just a tip, but it seems you are starting with a document like this:

{
    "_id" : 1,
    "name" : {
        "first" : "John",
        "last" : "Backus"
    },
    "awards" : [
        {
            "award" : "W.W. McDowell Award",
            "year" : 1967,
            "by" : "IEEE Computer Society"
        },
        {
            "award" : "National Medal of Science",
            "year" : 1975,
            "by" : "National Science Foundation"
        },
        {
            "award" : "Turing Award",
            "year" : 1977,
            "by" : "ACM"
        },
        {
            "award" : "some other award",
            "year" : 1979,
            "by" : "someone else"
        }
    ]
}

So the real point here is that while you may have reached for $redact (and it is a bit nicer than using $project for a logical condition and then using $match to filter on that logical match), it isn't really the best tool for the comparison you want to do here.

Before moving on, I want to point out the main problem with $redact here. Whatever you do, the logic (without an unwind) would have to compare "directly" on $$DESCEND in order to process the array elements on the value of "year" at whatever level.

That recursion is going to invalidate the "award1" condition, since it has the same field name. Even renaming the field kills the logic, since a projected value that is missing at a given level would not be greater than the tested value.

In a nutshell, $redact is ruled right out, since you cannot say "take from here only" with the logic it applies.

The alternative is to use $map and $setDifference to filter the contents of the arrays as follows:

db.bios.aggregate([
    { "$match": { "awards.award": "Turing Award" } },
    // Split the awards into the Turing Award itself and everything else
    { "$project": {
        "first_name": "$name.first",
        "last_name": "$name.last",
        "award1": { "$setDifference": [
            { "$map": {
                "input": "$awards",
                "as": "a",
                "in": { "$cond": [
                    { "$eq": [ "$$a.award", "Turing Award" ] },
                    "$$a",
                    false
                ]}
            }},
            [false]
        ]},
        "award2": { "$setDifference": [
            { "$map": {
                "input": "$awards",
                "as": "a",
                "in": { "$cond": [
                    { "$ne": [ "$$a.award", "Turing Award" ] },
                    "$$a",
                    false
                ]}
            }},
            [false]
        ]}
    }},
    // "award1" is still an array here, so make it a single document
    { "$unwind": "$award1" },
    // Keep only the awards received before the Turing Award
    { "$project": {
        "first_name": 1,
        "last_name": 1,
        "award1": 1,
        "award2": { "$setDifference": [
            { "$map": {
                "input": "$award2",
                "as": "a",
                "in": { "$cond": [
                    { "$gt": [ "$award1.year", "$$a.year" ] },
                    "$$a",
                    false
                ]}
            }},
            [false]
        ]}
    }}
])

And there really is no "pretty" way of getting around either the usage of $unwind in an intermediate stage or the second $project here, since $map (and the $setDifference filter) still returns an array. So the $unwind is necessary to turn that array into a singular entry (provided your condition only matches 1 element) for use in the comparison.

Trying to "squish" all the logic into a single $project will only result in "arrays of arrays" in the second output, so some "unwinding" is still required. At least this way, unwinding the (hopefully) 1 match is not really that costly and keeps the output clean.
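To make the "still an array" point concrete, here is a minimal plain-JavaScript sketch that imitates the idiom outside MongoDB. The helper names mapOrFalse and dropFalse are made up for this illustration, and the sketch ignores the fact that the real $setDifference treats its inputs as sets (duplicates are dropped and element order is not guaranteed).

```javascript
// Hypothetical helpers that imitate the pipeline idiom in plain JavaScript.

// Like $map with a $cond: keep the element when the test passes, else emit false.
function mapOrFalse(array, test) {
  return array.map(function (item) {
    return test(item) ? item : false;
  });
}

// Like $setDifference against [false]: strip the false placeholders.
function dropFalse(array) {
  return array.filter(function (item) {
    return item !== false;
  });
}

const awards = [
  { award: "W.W. McDowell Award", year: 1967 },
  { award: "National Medal of Science", year: 1975 },
  { award: "Turing Award", year: 1977 },
  { award: "some other award", year: 1979 }
];

// First $project: split into the Turing Award and everything else.
const award1 = dropFalse(mapOrFalse(awards, function (a) {
  return a.award === "Turing Award";
}));
const others = dropFalse(mapOrFalse(awards, function (a) {
  return a.award !== "Turing Award";
}));

// award1 is an array with one element, not a single document,
// which is why the pipeline needs $unwind before comparing years.
const turing = award1[0];

// Second $project: keep only the awards received before the Turing Award.
const award2 = dropFalse(mapOrFalse(others, function (a) {
  return turing.year > a.year;
}));

console.log(JSON.stringify(award2, null, 2));
```

Taking award1[0] plays the role of $unwind here; inside a single $project there is no such step available between the two filters, which is where the nested arrays would come from.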


But the other thing to note here is that you are not really "aggregating" at all. This is just document manipulation, so you might consider doing the manipulation directly in client code, as demonstrated with this shell example:

db.bios.find(
    { "awards.award": "Turing Award" },
    { "name": 1, "awards": 1 }
).forEach(function(doc) {
    doc.first_name = doc.name.first;
    doc.last_name = doc.name.last;
    // The query guarantees at least one Turing Award entry
    doc.award1 = doc.awards.filter(function(award) {
        return award.award == "Turing Award";
    })[0];
    // Keep only the awards received before it
    doc.award2 = doc.awards.filter(function(award) {
        return doc.award1.year > award.year;
    });
    delete doc.name;
    delete doc.awards;
    printjson(doc);
})

At any rate, both approaches output the same:

{
    "_id" : 1,
    "first_name" : "John",
    "last_name" : "Backus",
    "award1" : {
        "award" : "Turing Award",
        "year" : 1977,
        "by" : "ACM"
    },
    "award2" : [
        {
            "award" : "W.W. McDowell Award",
            "year" : 1967,
            "by" : "IEEE Computer Society"
        },
        {
            "award" : "National Medal of Science",
            "year" : 1975,
            "by" : "National Science Foundation"
        }
    ]
}

The only real difference here is that by using .aggregate() the content of "award2" is already filtered when returned from the server. That probably isn't going to be much different from the client processing approach, unless the items that would be removed make up a reasonably large list per document.
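If you want to try the client-side reshaping without a database, the same logic can be pulled into a standalone function. This is only a sketch that runs in Node.js: the function name awardsBeforeTuring is made up for this example, and the null return for documents without a Turing Award is an added safeguard that the shell version does not need, because its query already filters for that.

```javascript
// Hypothetical helper: reshape one bios-style document on the client.
// Returns null when the document has no Turing Award entry.
function awardsBeforeTuring(doc) {
  const awards = doc.awards || [];
  const turing = awards.find(function (award) {
    return award.award === "Turing Award";
  });
  if (!turing) {
    return null;
  }
  return {
    _id: doc._id,
    first_name: doc.name.first,
    last_name: doc.name.last,
    award1: turing,
    // Keep only the awards received before the Turing Award
    award2: awards.filter(function (award) {
      return turing.year > award.year;
    })
  };
}

// Sample document in the shape used in this answer
const sample = {
  _id: 1,
  name: { first: "John", last: "Backus" },
  awards: [
    { award: "W.W. McDowell Award", year: 1967, by: "IEEE Computer Society" },
    { award: "National Medal of Science", year: 1975, by: "National Science Foundation" },
    { award: "Turing Award", year: 1977, by: "ACM" },
    { award: "some other award", year: 1979, by: "someone else" }
  ]
};

const reshaped = awardsBeforeTuring(sample);
console.log(JSON.stringify(reshaped, null, 2));
```

Unlike the shell loop, this builds a new object rather than mutating the fetched document, which makes it easier to check in isolation.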


For the record, the only alteration to your existing aggregation pipeline really required here is to add a $group to the end to "re-combine" the array entries into a single document:

db.bios.aggregate([
    { "$match": { "awards.award": "Turing Award" } },
    { "$project": {
        "first_name": "$name.first",
        "last_name": "$name.last",
        "award1": "$awards",
        "award2": "$awards"
    }},
    { "$unwind": "$award1" },
    { "$match": { "award1.award": "Turing Award" } },
    { "$unwind": "$award2" },
    { "$redact": {
        "$cond": {
            "if": { "$gt": [ "$award1.year", "$award2.year" ] },
            "then": "$$KEEP",
            "else": "$$PRUNE"
        }
    }},
    // Put the surviving "award2" entries back into one array per person
    { "$group": {
        "_id": "$_id",
        "first_name": { "$first": "$first_name" },
        "last_name": { "$first": "$last_name" },
        "award1": { "$first": "$award1" },
        "award2": { "$push": "$award2" }
    }}
])

But then again, there is all the "array duplication" and "cost of unwind" associated with the operations here. So either of the first two approaches is what you really want in order to avoid that.
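To see where that cost comes from, the unwind, filter and regroup steps can be imitated in plain JavaScript. This is a rough sketch, not how the server implements these stages: unwind and regroup are made-up helper names, and regroup only handles the specific $first/$push combination used above.

```javascript
// Hypothetical stand-in for $unwind: one output document per array element,
// with the array field replaced by that element.
function unwind(docs, field) {
  const out = [];
  docs.forEach(function (doc) {
    (doc[field] || []).forEach(function (item) {
      const copy = Object.assign({}, doc);
      copy[field] = item;
      out.push(copy);
    });
  });
  return out;
}

// Hypothetical stand-in for the $group stage: first values for the scalar
// fields and award1, pushed values for award2.
function regroup(docs) {
  const byId = new Map();
  docs.forEach(function (doc) {
    if (!byId.has(doc._id)) {
      byId.set(doc._id, Object.assign({}, doc, { award2: [] }));
    }
    byId.get(doc._id).award2.push(doc.award2);
  });
  return Array.from(byId.values());
}

const awards = [
  { award: "W.W. McDowell Award", year: 1967 },
  { award: "National Medal of Science", year: 1975 },
  { award: "Turing Award", year: 1977 },
  { award: "some other award", year: 1979 }
];

// State after the $project stage: the same array copied into two fields.
const projected = [
  { _id: 1, first_name: "John", last_name: "Backus", award1: awards, award2: awards }
];

const afterFirstUnwind = unwind(projected, "award1");
const turingOnly = afterFirstUnwind.filter(function (d) {
  return d.award1.award === "Turing Award";
});
const afterSecondUnwind = unwind(turingOnly, "award2");
const kept = afterSecondUnwind.filter(function (d) {
  return d.award1.year > d.award2.year;
});
const result = regroup(kept);

console.log(afterFirstUnwind.length, afterSecondUnwind.length, result.length);
```

Each unwind produces one intermediate document per array element for every person, only for most of them to be discarded or merged back together at the end.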

