
MongoDB Query Summary



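Introduction

I previously wrote an introductory post on MongoDB, 浅谈MongoDB数据库 ("A Brief Look at the MongoDB Database"), which used only simple queries. Since then the business logic has become more complex. I had never studied MongoDB queries closely and assumed a plain call to find would be enough, which turned out to be naive.

So this post walks through MongoDB queries by example.

MongoDB queries fall into two kinds:

  • Ordinary queries, similar to SELECT ... WHERE in SQL

  • Aggregation queries, similar to GROUP BY in SQL

Ordinary queries

The official documentation is the primary reference here. Ordinary queries mainly use the db.collection.find() function.

The sample data is defined below; this initialization script can be run in the mongo shell.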

db.inventory.insertMany([
   { item: "journal", qty: 25, size: { h: 14, w: 21, uom: "cm" }, status: "A" },
   { item: "notebook", qty: 50, size: { h: 8.5, w: 11, uom: "in" }, status: "A" },
   { item: "paper", qty: 100, size: { h: 8.5, w: 11, uom: "in" }, status: "D" },
   { item: "planner", qty: 75, size: { h: 22.85, w: 30, uom: "cm" }, status: "D" },
   { item: "postcard", qty: 45, size: { h: 10, w: 15.25, uom: "cm" }, status: "A" }
]);

  • Query all documents

db.inventory.find( {} )

Equivalent SQL statement

SELECT * FROM inventory

  • Conditional query

Syntax

{ <field1>: <value1>, ... }

For example, query the records whose status is "D".

db.inventory.find( { status: "D" } )

Equivalent SQL statement

SELECT * FROM inventory WHERE status = "D"
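
  • Conditional query with operators

Syntax

{ <field1>: { <operator1>: <value1> }, ... }

For example, query the records whose status is one of the values in the array [ "A", "D" ].

db.inventory.find( { status: { $in: [ "A", "D" ] } } )

Equivalent SQL statement

SELECT * FROM inventory WHERE status in ("A", "D")

  • AND conditions

Listing several fields in the find filter is enough; every listed condition must hold, so this is an AND.

For example, the statement below matches records whose status is "A" and whose qty is less than 30.

db.inventory.find( { status: "A", qty: { $lt: 30 } } )

Equivalent SQL statement

SELECT * FROM inventory WHERE status = "A" AND qty < 30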

  • OR conditions

Unlike AND, OR requires the $or operator, as shown below.

db.inventory.find( { $or: [ { status: "A" }, { qty: { $lt: 30 } } ] } )

Similar to the following SQL

SELECT * FROM inventory WHERE status = "A" OR qty < 30

  • OR and AND combined

db.inventory.find( {
     status: "A",
     $or: [ { qty: { $lt: 30 } }, { item: /^p/ } ]
} )

This means the following.

SELECT * FROM inventory WHERE status = "A" AND ( qty < 30 OR item LIKE "p%")
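
To see which of the sample documents the combined filter selects, its logic can be sketched in plain JavaScript over an in-memory copy of the inventory data. This only illustrates the predicate, not how MongoDB evaluates it; the inventory array and the matches helper are names made up for this sketch.

```javascript
// In-memory copy of the sample inventory documents (size field omitted for brevity).
const inventory = [
  { item: "journal", qty: 25, status: "A" },
  { item: "notebook", qty: 50, status: "A" },
  { item: "paper", qty: 100, status: "D" },
  { item: "planner", qty: 75, status: "D" },
  { item: "postcard", qty: 45, status: "A" },
];

// Hypothetical helper mirroring
// { status: "A", $or: [ { qty: { $lt: 30 } }, { item: /^p/ } ] }.
function matches(doc) {
  return doc.status === "A" && (doc.qty < 30 || /^p/.test(doc.item));
}

const matchedItems = inventory.filter(matches).map((doc) => doc.item);
console.log(matchedItems); // [ 'journal', 'postcard' ]
```

Here journal matches through qty < 30 and postcard through the /^p/ pattern, while paper and planner are excluded by status even though their names start with "p".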

Query examples

  • Select all

SELECT *
FROM people

db.people.find()

  • Select specific fields

SELECT id,
       user_id,
       status
FROM people

db.people.find(
    { },
    { user_id: 1, status: 1 }
)

  • Select specific fields without _id

SELECT user_id, status
FROM people

db.people.find(
    { },
    { user_id: 1, status: 1, _id: 0 }
)
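
The difference between the two projections above is that _id comes back unless it is explicitly set to 0. A rough in-memory sketch of an inclusion projection follows; the people documents and the project helper are invented for illustration and cover only this simple case.

```javascript
// Hypothetical people documents; the field values are invented for illustration.
const people = [
  { _id: 1, user_id: "abc123", age: 55, status: "A" },
  { _id: 2, user_id: "bcd001", age: 30, status: "D" },
];

// Mimics an inclusion projection: keep the listed fields,
// plus _id unless the projection sets _id to 0.
function project(doc, projection) {
  const result = {};
  if (projection._id !== 0 && "_id" in doc) {
    result._id = doc._id;
  }
  for (const [field, flag] of Object.entries(projection)) {
    if (field !== "_id" && flag === 1 && field in doc) {
      result[field] = doc[field];
    }
  }
  return result;
}

const withId = people.map((doc) => project(doc, { user_id: 1, status: 1 }));
const withoutId = people.map((doc) => project(doc, { user_id: 1, status: 1, _id: 0 }));
console.log(withId[0]);    // { _id: 1, user_id: 'abc123', status: 'A' }
console.log(withoutId[0]); // { user_id: 'abc123', status: 'A' }
```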

  • Select all fields with a condition

SELECT *
FROM people
WHERE status = "A"

db.people.find(
    { status: "A" }
)

  • Select specific fields with a condition

SELECT user_id, status
FROM people
WHERE status = "A"

db.people.find(
    { status: "A" },
    { user_id: 1, status: 1, _id: 0 }
)

  • Not-equal condition

SELECT *
FROM people
WHERE status != "A"

db.people.find(
    { status: { $ne: "A" } }
)

  • AND condition

SELECT *
FROM people
WHERE status = "A"
AND age = 50

db.people.find(
    { status: "A",
      age: 50 }
)

  • OR condition

SELECT *
FROM people
WHERE status = "A"
OR age = 50

db.people.find(
    { $or: [ { status: "A" } ,
             { age: 50 } ] }
)

  • Greater-than condition

SELECT *
FROM people
WHERE age > 25

db.people.find(
    { age: { $gt: 25 } }
)

  • Less-than condition

SELECT *
FROM people
WHERE age < 25

db.people.find(
   { age: { $lt: 25 } }
)

  • Range condition

SELECT *
FROM people
WHERE age > 25
AND   age <= 50

db.people.find(
   { age: { $gt: 25, $lte: 50 } }
)

  • LIKE condition

SELECT *
FROM people
WHERE user_id like "%bc%"

db.people.find( { user_id: /bc/ } )

// or

db.people.find( { user_id: { $regex: /bc/ } } )

SELECT *
FROM people
WHERE user_id like "bc%"

db.people.find( { user_id: /^bc/ } )

// or

db.people.find( { user_id: { $regex: /^bc/ } } )

  • Sorting

SELECT *
FROM people
WHERE status = "A"
ORDER BY user_id ASC

db.people.find( { status: "A" } ).sort( { user_id: 1 } )

SELECT *
FROM people
WHERE status = "A"
ORDER BY user_id DESC

db.people.find( { status: "A" } ).sort( { user_id: -1 } )

  • Counting

SELECT COUNT(*)
FROM people

db.people.count()

// or

db.people.find().count()

SELECT COUNT(user_id)
FROM people

db.people.count( { user_id: { $exists: true } } )

// or

db.people.find( { user_id: { $exists: true } } ).count()

SELECT COUNT(*)
FROM people
WHERE age > 30

db.people.count( { age: { $gt: 30 } } )

// or

db.people.find( { age: { $gt: 30 } } ).count()
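
One subtlety in the COUNT(user_id) mapping: as far as I know, $exists: true also matches documents where the field is present but set to null, whereas SQL COUNT(user_id) skips NULL values. The in-memory sketch below contrasts the two notions; the sample documents are invented, and the "in" check is only a stand-in for $exists.

```javascript
// Hypothetical documents: one with a value, one with an explicit null, one without the field.
const people = [
  { user_id: "abc123" },
  { user_id: null },
  { age: 40 },
];

// Stand-in for { user_id: { $exists: true } }: the field is present, whatever its value.
const existsCount = people.filter((doc) => "user_id" in doc).length;

// Closer to SQL COUNT(user_id): the field is present and not null.
const nonNullCount = people.filter((doc) => doc.user_id != null).length;

console.log(existsCount);  // 2
console.log(nonNullCount); // 1
```

If null values should be excluded on the MongoDB side as well, the filter would need an additional condition such as $ne: null.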

  • Distinct values

SELECT DISTINCT(status)
FROM people

db.people.distinct( "status" )

  • Limiting the number of results

SELECT *
FROM people
LIMIT 1

db.people.findOne()

// or

db.people.find().limit(1)

SELECT *
FROM people
LIMIT 5
SKIP 10

db.people.find().limit(5).skip(10)
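
Although limit(5) is written before skip(10) in the chain, the skip is applied first and the limit second, so the call returns documents 11 to 15 of the result. A small in-memory sketch of that paging arithmetic, using a made-up page helper and invented documents:

```javascript
// Twenty hypothetical documents numbered 1..20.
const people = Array.from({ length: 20 }, (_, i) => ({ n: i + 1 }));

// Hypothetical helper: drop `skip` documents first, then keep at most `limit`,
// which is the effect of find().limit(limit).skip(skip).
function page(docs, skip, limit) {
  return docs.slice(skip, skip + limit);
}

const pageNumbers = page(people, 10, 5).map((doc) => doc.n);
console.log(pageNumbers); // [ 11, 12, 13, 14, 15 ]
```

In a real query a sort() would normally be added as well, since without one the order of the returned documents is not guaranteed.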

  • EXPLAIN

EXPLAIN SELECT *
FROM people
WHERE status = "A"

db.people.find( { status: "A" } ).explain()

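Aggregation queries

The ordinary queries above only need the find function, while aggregation queries use a different function, aggregate (see the official documentation).

The sample data is as follows. The SQL examples assume two tables, orders and order_lineitem, joined on order_lineitem.order_id and orders.id. The MongoDB examples use a single orders collection in which the line items are embedded in the items array:

{
  cust_id: "abc123",
  ord_date: ISODate("2012-11-02T17:04:11.102Z"),
  status: 'A',
  price: 50,
  items: [ { sku: "xxx", qty: 25, price: 1 },
           { sku: "yyy", qty: 25, price: 1 } ]
}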

  • Count

db.orders.aggregate( [
   {
     $group: {
        _id: null,
        count: { $sum: 1 }
     }
   }
] )

Equivalent SQL statement

SELECT COUNT(*) AS count
FROM orders

  • Sum

db.orders.aggregate( [
   {
     $group: {
        _id: null,
        total: { $sum: "$price" }
     }
   }
] )

Equivalent SQL statement

SELECT SUM(price) AS total
FROM orders

  • Sum per group

db.orders.aggregate( [
   {
     $group: {
        _id: "$cust_id",
        total: { $sum: "$price" }
     }
   }
] )

Equivalent SQL statement

SELECT cust_id,
       SUM(price) AS total
FROM orders
GROUP BY cust_id
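
What this $group stage computes can be sketched in plain JavaScript: the _id expression picks the grouping key, and $sum accumulates a value for each key. The orders below and the groupTotalByCustomer helper are invented for this sketch and are not part of MongoDB.

```javascript
// Hypothetical orders; the values are invented for illustration.
const orders = [
  { cust_id: "abc123", status: "A", price: 50 },
  { cust_id: "abc123", status: "A", price: 25 },
  { cust_id: "xyz789", status: "D", price: 100 },
];

// Mimics { $group: { _id: "$cust_id", total: { $sum: "$price" } } }.
function groupTotalByCustomer(docs) {
  const totals = new Map();
  for (const doc of docs) {
    totals.set(doc.cust_id, (totals.get(doc.cust_id) || 0) + doc.price);
  }
  // One output document per distinct key, shaped like the pipeline's output.
  return Array.from(totals, ([_id, total]) => ({ _id, total }));
}

const totals = groupTotalByCustomer(orders);
console.log(totals);
// [ { _id: 'abc123', total: 75 }, { _id: 'xyz789', total: 100 } ]
```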

  • Sum per group, sorted

db.orders.aggregate( [
   {
     $group: {
        _id: "$cust_id",
        total: { $sum: "$price" }
     }
   },
   { $sort: { total: 1 } }
] )

Equivalent SQL statement

SELECT cust_id,
       SUM(price) AS total
FROM orders
GROUP BY cust_id
ORDER BY total

  • Group by multiple fields

db.orders.aggregate( [
   {
     $group: {
        _id: {
           cust_id: "$cust_id",
           ord_date: {
               month: { $month: "$ord_date" },
               day: { $dayOfMonth: "$ord_date" },
               year: { $year: "$ord_date"}
           }
        },
        total: { $sum: "$price" }
     }
   }
] )

Equivalent SQL statement

SELECT cust_id,
       ord_date,
       SUM(price) AS total
FROM orders
GROUP BY cust_id,
         ord_date

  • Filtering groups (HAVING)

db.orders.aggregate( [
   {
     $group: {
        _id: "$cust_id",
        count: { $sum: 1 }
     }
   },
   { $match: { count: { $gt: 1 } } }
] )

Equivalent SQL statement

SELECT cust_id,
       count(*)
FROM orders
GROUP BY cust_id
HAVING count(*) > 1

  • Grouped statistics with a more complex condition

db.orders.aggregate( [
   {
     $group: {
        _id: {
           cust_id: "$cust_id",
           ord_date: {
               month: { $month: "$ord_date" },
               day: { $dayOfMonth: "$ord_date" },
               year: { $year: "$ord_date"}
           }
        },
        total: { $sum: "$price" }
     }
   },
   { $match: { total: { $gt: 250 } } }
] )

Equivalent SQL statement

SELECT cust_id,
       ord_date,
       SUM(price) AS total
FROM orders
GROUP BY cust_id,
         ord_date
HAVING total > 250

  • Grouped statistics with a more complex condition, example 1

db.orders.aggregate( [
   { $match: { status: 'A' } },
   {
     $group: {
        _id: "$cust_id",
        total: { $sum: "$price" }
     }
   }
] )

Equivalent SQL statement

SELECT cust_id,
       SUM(price) as total
FROM orders
WHERE status = 'A'
GROUP BY cust_id

  • Grouped statistics with a more complex condition, example 2

db.orders.aggregate( [
   { $match: { status: 'A' } },
   {
     $group: {
        _id: "$cust_id",
        total: { $sum: "$price" }
     }
   },
   { $match: { total: { $gt: 250 } } }
] )

Equivalent SQL statement

SELECT cust_id,
       SUM(price) as total
FROM orders
WHERE status = 'A'
GROUP BY cust_id
HAVING total > 250
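
The position of $match in the pipeline decides whether it behaves like WHERE or like HAVING: before $group it filters documents, after $group it filters groups. A minimal in-memory sketch of the three stages of example 2, with invented orders and variable names:

```javascript
// Hypothetical orders; the values are invented for illustration.
const orders = [
  { cust_id: "abc123", status: "A", price: 200 },
  { cust_id: "abc123", status: "A", price: 100 },
  { cust_id: "abc123", status: "D", price: 500 },
  { cust_id: "xyz789", status: "A", price: 100 },
  { cust_id: "xyz789", status: "D", price: 300 },
];

// Stage 1, like { $match: { status: 'A' } }: filters documents before grouping (WHERE).
const activeOrders = orders.filter((doc) => doc.status === "A");

// Stage 2, like the $group stage: sums price per cust_id.
const totalsByCustomer = new Map();
for (const doc of activeOrders) {
  totalsByCustomer.set(doc.cust_id, (totalsByCustomer.get(doc.cust_id) || 0) + doc.price);
}
const grouped = Array.from(totalsByCustomer, ([_id, total]) => ({ _id, total }));

// Stage 3, like { $match: { total: { $gt: 250 } } }: filters groups after grouping (HAVING).
const bigSpenders = grouped.filter((group) => group.total > 250);
console.log(bigSpenders); // [ { _id: 'abc123', total: 300 } ]
```

The status "D" order of 500 never reaches the grouping stage, which is why the total for abc123 is 300 rather than 800.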

  • Table join

db.orders.aggregate( [
   { $unwind: "$items" },
   {
     $group: {
        _id: "$cust_id",
        qty: { $sum: "$items.qty" }
     }
   }
] )

Equivalent SQL statement

SELECT cust_id,
       SUM(li.qty) as qty
FROM orders o,
     order_lineitem li
WHERE li.order_id = o.id
GROUP BY cust_id
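
The $unwind stage plays the role of the join here: it emits one document per element of the items array, with items replaced by that single element. A rough in-memory sketch using flatMap, with invented orders:

```javascript
// Hypothetical orders with embedded line items; the values are invented for illustration.
const orders = [
  { cust_id: "abc123", items: [ { sku: "xxx", qty: 25 }, { sku: "yyy", qty: 25 } ] },
  { cust_id: "abc123", items: [ { sku: "xxx", qty: 10 } ] },
  { cust_id: "xyz789", items: [ { sku: "zzz", qty: 5 } ] },
];

// Like { $unwind: "$items" }: one output document per array element.
const unwound = orders.flatMap((order) =>
  order.items.map((item) => ({ ...order, items: item }))
);

// Like the $group stage: sums items.qty per cust_id.
const qtyByCustomer = new Map();
for (const doc of unwound) {
  qtyByCustomer.set(doc.cust_id, (qtyByCustomer.get(doc.cust_id) || 0) + doc.items.qty);
}
const result = Array.from(qtyByCustomer, ([_id, qty]) => ({ _id, qty }));

console.log(unwound.length); // 4
console.log(result); // [ { _id: 'abc123', qty: 60 }, { _id: 'xyz789', qty: 5 } ]
```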

  • Nested query

db.orders.aggregate( [
   {
     $group: {
        _id: {
           cust_id: "$cust_id",
           ord_date: {
               month: { $month: "$ord_date" },
               day: { $dayOfMonth: "$ord_date" },
               year: { $year: "$ord_date"}
           }
        }
     }
   },
   {
     $group: {
        _id: null,
        count: { $sum: 1 }
     }
   }
] )

Equivalent SQL statement

SELECT COUNT(*)
FROM (SELECT cust_id,
             ord_date
      FROM orders
      GROUP BY cust_id,
               ord_date)
      as DerivedTable
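
The two chained $group stages amount to counting distinct (cust_id, day) combinations: the first stage produces one group per combination, and the second counts those groups. A small in-memory sketch with invented orders; the UTC getters assume that $year, $month and $dayOfMonth work in UTC when no timezone is given.

```javascript
// Hypothetical orders; the dates are invented for illustration.
const orders = [
  { cust_id: "abc123", ord_date: new Date("2012-11-02T17:04:11.102Z") },
  { cust_id: "abc123", ord_date: new Date("2012-11-02T09:00:00.000Z") },
  { cust_id: "abc123", ord_date: new Date("2012-11-03T10:00:00.000Z") },
  { cust_id: "xyz789", ord_date: new Date("2012-11-02T12:00:00.000Z") },
];

// First $group: one group per distinct (cust_id, year, month, day) combination.
const groupKeys = new Set(
  orders.map((order) => {
    const d = order.ord_date;
    return [order.cust_id, d.getUTCFullYear(), d.getUTCMonth() + 1, d.getUTCDate()].join("|");
  })
);

// Second $group with _id: null: counts the groups produced by the first stage.
const count = groupKeys.size;
console.log(count); // 3
```

The two abc123 orders placed on 2012-11-02 fall into the same group, so four orders yield three groups.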

Map-Reduce

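MongoDB has another kind of aggregation called Map-Reduce (see the official documentation). Conceptually it works like Hadoop: it reads its input from a single collection and writes the result to a collection. Map/Reduce is often a good tool for operations similar to GROUP BY in SQL.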


Method signature

db.collection.mapReduce(
    <map>,
    <reduce>,
    {
        out: <collection>,
        query: <document>,
        sort: <document>,
        limit: <number>,
        finalize: <function>,
        scope: <document>,
        jsMode: <boolean>,
        verbose: <boolean>,
        bypassDocumentValidation: <boolean>
    }
)

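Parameters

  • mapReduce: the name of the collection to run Map/Reduce on (with db.collection.mapReduce(), this is the collection the method is called on)

  • map: the map function (described in detail below)

  • reduce: the reduce function (described in detail below)

  • out: the collection in which the result is stored (described in detail below)

  • query: the query that selects the input documents <optional>

  • sort: sorts the input documents by a key <optional>

  • limit: the maximum number of documents read from the collection <optional>

  • finalize: further processes the output of reduce <optional>

  • scope: variables that are accessible in the map, reduce and finalize functions <optional>

  • jsMode: whether intermediate data stays as JavaScript objects between the map and reduce steps instead of being converted to BSON <optional>

  • verbose: whether to include statistics about the job's execution in the result <optional>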

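Examples

The examples below illustrate what Map-Reduce is used for. They take more code, and the aggregation queries above could probably do the same job in one step, but they serve purely as illustrations here.

Suppose there is an orders collection with documents like the one below, and we need to compute the order total for each customer.

{
     _id: ObjectId("50a8240b927d5d8b5891743c"),
     cust_id: "abc123",
     ord_date: new Date("Oct 04, 2012"),
     status: 'A',
     price: 25,
     items: [ { sku: "mmm", qty: 5, price: 2.5 },
              { sku: "nnn", qty: 5, price: 2.5 } ]
}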

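First define the map function, which states the fields the later aggregation needs. Since we are computing each customer's order total, the customer ID and the price are what we need.

var mapFunction1 = function() {
    emit(this.cust_id, this.price);
};

Then define the reduce function, which sums the order prices for each customer.

var reduceFunction1 = function(keyCustId, valuesPrices) {
    return Array.sum(valuesPrices);
};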

Then run the job and store the final result.

db.orders.mapReduce(
    mapFunction1,
    reduceFunction1,
    { out: "map_reduce_example" }
)

That completes a simple Map-Reduce example; the result is stored in the map_reduce_example collection.
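
To make the data flow concrete, the sketch below replays the same map and reduce logic in memory: every emit(key, value) call adds a value to the list for its key, and reduce then collapses each list into a single value. The runMapReduce helper and the sample orders are invented for this sketch; emit is passed in as a parameter here, and Array.sum is replaced with reduce() because it is not part of standard JavaScript.

```javascript
// Hypothetical orders; the values are invented for illustration.
const orders = [
  { cust_id: "abc123", price: 25 },
  { cust_id: "abc123", price: 50 },
  { cust_id: "xyz789", price: 10 },
];

// Same logic as mapFunction1, except that emit is received as a parameter.
function mapFunction1(emit) {
  emit(this.cust_id, this.price);
}

// Same logic as reduceFunction1, written with standard JavaScript only.
function reduceFunction1(keyCustId, valuesPrices) {
  return valuesPrices.reduce((sum, price) => sum + price, 0);
}

// Hypothetical in-memory runner: collect emitted values per key, then reduce each key.
function runMapReduce(docs, mapFn, reduceFn) {
  const valuesByKey = new Map();
  const emit = (key, value) => {
    if (!valuesByKey.has(key)) {
      valuesByKey.set(key, []);
    }
    valuesByKey.get(key).push(value);
  };
  for (const doc of docs) {
    mapFn.call(doc, emit); // the document is bound to `this`, as in the map function above
  }
  return Array.from(valuesByKey, ([key, values]) => ({
    _id: key,
    value: reduceFn(key, values),
  }));
}

const totals = runMapReduce(orders, mapFunction1, reduceFunction1);
console.log(totals);
// [ { _id: 'abc123', value: 75 }, { _id: 'xyz789', value: 10 } ]
```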

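The example above is fairly simple, so here is a more complex one.

Each order contains items with an sku name, a quantity and a price. For orders dated after 01/01/2012 we want, for each sku, the number of orders, the total quantity, and the average quantity per order.

As before, start by defining a map function.

var mapFunction2 = function() {
    for (var idx = 0; idx < this.items.length; idx++) {
        var key = this.items[idx].sku;
        var value = {
                        count: 1,
                        qty: this.items[idx].qty
                    };
        emit(key, value);
    }
};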

Then the reduce function totals the order count and the quantity for each sku.

var reduceFunction2 = function(keySKU, countObjVals) {
    // Declared with var so the accumulator does not leak into the global scope
    var reducedVal = { count: 0, qty: 0 };

    for (var idx = 0; idx < countObjVals.length; idx++) {
        reducedVal.count += countObjVals[idx].count;
        reducedVal.qty += countObjVals[idx].qty;
    }

    return reducedVal;
};

Once the totals are available, the finalize function computes the average quantity per order.

var finalizeFunction2 = function (key, reducedVal) {
    reducedVal.avg = reducedVal.qty / reducedVal.count;
    return reducedVal;
};

Add the date filter, and the complete map-reduce call is as follows.

db.orders.mapReduce(
    mapFunction2,
    reduceFunction2,
    {
        out: { merge: "map_reduce_example" },
        query: {
            ord_date:{ $gt: new Date('01/01/2012') }
        },
        finalize: finalizeFunction2
    }
)
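
One reason the emitted value and the reduce result share the shape { count, qty } is that reduce may be invoked more than once for the same key, with the output of an earlier call fed back in as one of the values. The in-memory sketch below checks that reducing in two steps gives the same result as reducing once, and shows why the average is left to finalize. The functions repeat the logic above in standalone form; the emitted values are invented.

```javascript
// Standalone copy of the reduce logic above.
function reduceFunction2(keySKU, countObjVals) {
  const reducedVal = { count: 0, qty: 0 };
  for (const val of countObjVals) {
    reducedVal.count += val.count;
    reducedVal.qty += val.qty;
  }
  return reducedVal;
}

// Standalone copy of the finalize logic above.
function finalizeFunction2(key, reducedVal) {
  reducedVal.avg = reducedVal.qty / reducedVal.count;
  return reducedVal;
}

// Hypothetical values emitted by the map step for one sku.
const emitted = [
  { count: 1, qty: 5 },
  { count: 1, qty: 10 },
  { count: 1, qty: 15 },
];

// Reducing everything at once ...
const allAtOnce = reduceFunction2("mmm", emitted);

// ... must equal reducing a partial result together with the remaining value.
const partial = reduceFunction2("mmm", emitted.slice(0, 2));
const inTwoSteps = reduceFunction2("mmm", [partial, emitted[2]]);

console.log(allAtOnce);  // { count: 3, qty: 30 }
console.log(inTwoSteps); // { count: 3, qty: 30 }

// The average is computed once, after all reducing is done.
const finalized = finalizeFunction2("mmm", { ...allAtOnce });
console.log(finalized.avg); // 10
```

Computing the average inside reduce would break this property, because an average of partial averages is generally not the overall average.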

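Summary

That is my summary of MongoDB queries by example. I am a beginner and there is still a lot I do not understand, so corrections are welcome if you find any mistakes.

Related material

浅谈MongoDB数据库 (A Brief Look at the MongoDB Database)

Official documentation for ordinary queries

SQL to MongoDB mapping chart

Official aggregation documentation

Official Map-Reduce documentation

Map-Reduce API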
