Adds new tutorials, updates the existing #11
Changes from all commits
4f0801e
4f90be7
80ffdca
08cd8fe
7b49ed6
032552b
2779820
fe0556c
3bb968e
This file was deleted.
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,42 @@ | ||
| RedisBloom is developed by Redis Inc., and adds probabilistic data structures, including a Bloom filter, to Redis. To install RedisBloom on top of an existing Redis, download and run [Redis Stack](https://redis.io/docs/stack/get-started/install/), which includes RedisBloom and other capabilities. Also check out [RedisBloom commands](https://redis.io/commands/?group=bf). | ||
|
|
||
| A Bloom filter is a probabilistic data structure that lets you check whether an element is present in a set using a small, fixed amount of memory. It answers membership queries with a Boolean response and can guarantee only absence: when it responds that an element is not in the set (`false`), you can be sure that this is the case. However, false positive matches are possible. You can control the tradeoff between accuracy and memory consumption with the `error_rate` argument of the `BF.RESERVE` command. | ||
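To make the accuracy and memory tradeoff concrete, here is a minimal sketch that applies the textbook sizing formulas for a classic Bloom filter: m = -n ln(p) / (ln 2)^2 bits and k = (m / n) ln 2 hash functions, for n items and error rate p. The function name `bloom_size` is hypothetical, and the memory RedisBloom actually allocates for a given `error_rate` and capacity may differ from these theoretical values.

```python
import math


def bloom_size(capacity: int, error_rate: float) -> tuple:
    """Return (bits, hash_count) for a classic Bloom filter.

    Textbook estimate only; RedisBloom's real allocation may differ.
    """
    if capacity <= 0 or not 0 < error_rate < 1:
        raise ValueError("capacity must be > 0 and 0 < error_rate < 1")
    bits = math.ceil(-capacity * math.log(error_rate) / (math.log(2) ** 2))
    hash_count = max(1, round(bits / capacity * math.log(2)))
    return bits, hash_count


for rate in (0.01, 0.001):
    bits, hashes = bloom_size(1_000_000, rate)
    print(f"error_rate={rate}: {bits / 8 / 1024:.0f} KiB, {hashes} hash functions")
```

In this model, tightening the error rate from 1% to 0.1% costs about 1.5 times as many bits for the same capacity.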
|
|
||
| If clients send queries that reach the database only because no matching key exists in Redis, you can use a Bloom filter to filter out such queries. | ||
|
|
||
| These guidelines help you learn how to use a Bloom filter to reduce heavy calls to the relational database or memory. | ||
|
Member
I'm not sure if 'memory' should be used here. I assume the call wouldn't be heavy.
||
|
|
||
| ## Avoiding cache penetration | ||
|
Member
Again, the assumption is that there is some heavy database (not Redis) in the system, and we want to use RedisBloom to avoid cache penetration for that external database (not for Redis itself). It is important to explain this deployment here, and the problem that we are trying to solve, in order to make this use case understandable. In this specific example, each query is a client's IP address, and information about this IP address is stored in some external (presumably slow) database. There are many queries with IP addresses that do not exist in the external database, and we want to reduce the number of accesses to the external database by using RedisBloom to detect non-existent IP addresses.
||
|
|
||
| * Before checking the cache, implement some logic (e.g., IP range filtering). If the same unacceptable addresses are queried repeatedly, you may consider storing these addresses in Redis with an empty string value. | ||
| * If you need to store millions of invalid keys, indeed, you may consider using a Bloom filter. | ||
| * Create a Bloom filter using `BF.RESERVE` and add invalid addresses using `BF.ADD`. To determine if an invalid address has been seen before, use `BF.EXISTS`. The answer `1` means that, with high probability, the value has been seen before. A `0` means that it definitely wasn't seen before. | ||
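The semantics of the three commands can be illustrated without a Redis server. The sketch below is a small in-memory Bloom filter whose constructor, `add`, and `exists` mirror `BF.RESERVE`, `BF.ADD`, and `BF.EXISTS`. The class name `TinyBloom`, the SHA-256 double-hashing scheme, and the sample IP address are assumptions made for illustration; this is not how RedisBloom is implemented.

```python
import hashlib
import math


class TinyBloom:
    """In-memory Bloom filter for illustration; not RedisBloom's implementation."""

    def __init__(self, error_rate: float, capacity: int):
        # Same argument order as: BF.RESERVE key error_rate capacity
        self.bits = math.ceil(-capacity * math.log(error_rate) / math.log(2) ** 2)
        self.hashes = max(1, round(self.bits / capacity * math.log(2)))
        self.array = bytearray((self.bits + 7) // 8)

    def _positions(self, item: str):
        # Derive the bit positions from two 64-bit slices of one SHA-256
        # digest (double hashing).
        digest = hashlib.sha256(item.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1
        return [(h1 + i * h2) % self.bits for i in range(self.hashes)]

    def add(self, item: str) -> None:
        for pos in self._positions(item):
            self.array[pos // 8] |= 1 << (pos % 8)

    def exists(self, item: str) -> int:
        # 1: probably added before; 0: definitely never added.
        return int(all(self.array[pos // 8] & (1 << (pos % 8))
                       for pos in self._positions(item)))


invalid_ips = TinyBloom(error_rate=0.001, capacity=10_000)
invalid_ips.add("203.0.113.7")
print(invalid_ips.exists("203.0.113.7"))  # an added item is always reported as 1
```

An item that was never added is reported as `0` unless all of its bit positions happen to be set by other items, which is the false positive case.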
|
|
||
| ## Handling incoming requests | ||
|
|
||
| Because false-positive matches are possible with a Bloom filter (BF), you can use these options to better handle incoming requests. | ||
|
|
||
| ### Store all valid keys in a BF upfront | ||
|
Member
Just to clarify: 'keys' here are IP addresses, that is, keys of the external database.
||
|
|
||
| * Add all valid keys to the BF. | ||
| * When a request is received, search in the Bloom filter. | ||
| * If found in the BF, it is, with high probability, a valid key. Try to fetch it from the DB. If it is not found in the DB (low probability), it was a false positive. | ||
| * If not found in the BF, it is necessarily an invalid key. | ||
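The steps above can be sketched as follows. This is a minimal illustration under stated assumptions: `StubFilter` is a hypothetical exact set standing in for the Bloom filter (a false positive is simulated by adding a key that is not in the database), the dictionary `database` stands in for the slow external database, and `lookup` and `stats` are illustrative names.

```python
class StubFilter(set):
    """Exact set standing in for a Bloom filter; add extra items to simulate false positives."""

    def exists(self, item):
        return item in self


def lookup(key, bloom, database, stats):
    """Return the stored value, or None for an invalid key."""
    if not bloom.exists(key):
        return None                  # definitely invalid: skip the database
    stats["db_calls"] += 1
    return database.get(key)         # None here means a false positive


database = {"198.51.100.1": "known customer"}   # hypothetical external DB
bloom = StubFilter(database.keys())              # load every valid key upfront
bloom.add("192.0.2.99")                          # simulated false positive

stats = {"db_calls": 0}
print(lookup("198.51.100.1", bloom, database, stats))  # known customer
print(lookup("203.0.113.5", bloom, database, stats))   # None, without a DB call
print(lookup("192.0.2.99", bloom, database, stats))    # None, after a wasted DB call
print(stats)                                           # {'db_calls': 2}
```

Only keys that pass the filter reach the database, so the cost of a false positive is one wasted database call.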
|
|
||
| ### Store valid keys in a BF on the fly | ||
|
|
||
| * When a request is received, search in the Bloom filter. | ||
| * If found in the BF, it is, with high probability, a valid key that was already seen. Try to fetch it from the DB. If it is not found in the DB (low probability), it was a false positive. | ||
| * If not found in the BF, it is either a first-time valid key or an invalid key. Check, and if valid, add it to the BF. | ||
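A minimal sketch of this flow follows. The text does not say how a key's validity is checked, so this example assumes the check is a lookup in the external database; `StubFilter`, `handle_request`, `database`, and `stats` are hypothetical names, and the exact set only stands in for a real Bloom filter.

```python
class StubFilter(set):
    """Exact set standing in for a Bloom filter; add extra items to simulate false positives."""

    def exists(self, item):
        return item in self


def handle_request(key, bloom, database, stats):
    """Return the stored value, or None for an invalid key."""
    if bloom.exists(key):
        # Probably a valid key seen before; a miss in the DB is a false positive.
        stats["bloom_hits"] += 1
        return database.get(key)
    # First-time valid key or invalid key: check it (assumed here: ask the DB).
    stats["validity_checks"] += 1
    if key in database:
        bloom.add(key)               # remember the valid key for next time
        return database[key]
    return None


database = {"198.51.100.1": "known customer"}   # hypothetical external DB
bloom = StubFilter()                             # starts empty
stats = {"bloom_hits": 0, "validity_checks": 0}

handle_request("198.51.100.1", bloom, database, stats)  # first time: checked, then added
handle_request("198.51.100.1", bloom, database, stats)  # second time: found in the filter
handle_request("203.0.113.5", bloom, database, stats)   # invalid: checked, not added
print(stats)                                            # {'bloom_hits': 1, 'validity_checks': 2}
```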
|
|
||
|
|
||
| ### Store invalid keys in a BF | ||
|
|
||
| * When a request is received, search in the Bloom filter. | ||
| * If found in the BF, it is, with high probability, an invalid key. Note that it may be a valid key (low probability) and you will ignore it, but that's a price you should be ready to pay if you go this way. | ||
| * If not found in the BF, it is either a valid key or a first-time invalid key. Check and, if invalid, add it to the BF. | ||
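The following sketch shows this variant, which matches the cache-penetration scenario: repeated requests for the same invalid key reach the external database only once. As before, `StubFilter` is a hypothetical exact set standing in for the Bloom filter, and `lookup`, `database`, and `stats` are illustrative names.

```python
class StubFilter(set):
    """Exact set standing in for a Bloom filter; pre-add items to simulate false positives."""

    def exists(self, item):
        return item in self


def lookup(key, bloom, database, stats):
    """Return the stored value, or None when the key is treated as invalid."""
    if bloom.exists(key):
        return None                  # probably invalid: ignore without a DB call
    stats["db_calls"] += 1
    value = database.get(key)
    if value is None:
        bloom.add(key)               # first-time invalid key: remember it
    return value


database = {"198.51.100.1": "known customer"}   # hypothetical external DB
bloom = StubFilter()
stats = {"db_calls": 0}

for _ in range(3):                               # the same invalid key, three times
    lookup("203.0.113.5", bloom, database, stats)
print(stats)                                     # {'db_calls': 1}
print(lookup("198.51.100.1", bloom, database, stats))  # known customer
```

With a real Bloom filter, a valid key that collides with stored invalid keys would be ignored by the first branch, which is the price mentioned above.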
|
|
||
| ## Notes | ||
|
|
||
| * You don't need to add an item to a BF more than once. There is no benefit, but also no harm. | ||
| * You can't delete keys from a BF, but you can use a Cuckoo filter instead, which supports deletions but has some disadvantages compared to BF. RedisBloom supports Cuckoo filters as well. | ||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -1,10 +1,20 @@ | ||
| Redis Stack offers native graph capabilities. You can use graphs for highly interconnected data, like relationships between people, organisations, groups, documents or places they have access to and so on. | ||
| Redis Stack offers a labeled property graph data structure. The Labeled Property Graph data model is a modern generic NoSQL data model. | ||
| Basically, it utilizes the graph mathematical structure to represent and query data. | ||
| As a mathematical structure, a graph is a collection of vertices (also called nodes) and _edges_. As a data structure, in a labeled property graph, the graph vertices represent _entities_. | ||
| Entities are physical, conceptual, virtual, or fictional particulars or endurants, while the _graph edges_ represent relationships. | ||
| Each relationship is basically an association or an interaction between a pair of entities. | ||
| Each entity can have a set of labels, for example `Person`, `Police officer`, and `Bank Account`, and each relationship must have a type, for example `owns` or `member of`. | ||
| Each node and each relationship can also have a set of properties, where each property is a key-value pair. | ||
| For example, you can have a `name` property for a `Person` entity, or a `start date` property for an `owns` relationship. | ||
|
|
||
| For our shop, we would like to track which users have bought what so that we can suggest bikes based on the fact that their friends have also bought them. | ||
| You can use graphs for highly interconnected data, like relationships between people, organizations, groups, documents, or places they have access to, and so on. | ||
|
|
||
| ## Creating nodes | ||
| Suppose you want to track which bikes users bought so you can suggest them based on the fact that their friends have also bought them. | ||
|
Member
Change
||
|
|
||
| ## Create nodes | ||
|
|
||
| This query creates a single bike node and sets its properties. | ||
|
|
||
| This query will create a single bike node and set its properties | ||
| ```redis Create a bike node | ||
| GRAPH.QUERY bikes_graph 'CREATE (b:Bike { | ||
| Brand:"Velorim", | ||
|
|
@@ -14,6 +24,8 @@ GRAPH.QUERY bikes_graph 'CREATE (b:Bike { | |
| RETURN b' | ||
| ``` | ||
|
|
||
| Now, load more bikes. | ||
|
|
||
| ```redis Load more bikes | ||
| // Let's load some more bikes | ||
| GRAPH.QUERY bikes_graph 'CREATE (b:Bike { Brand:"Bicyk", Model:"Hillcraft", Price:"1200", Type: "Kids Mountain Bikes" })' | ||
|
|
@@ -27,6 +39,8 @@ GRAPH.QUERY bikes_graph 'CREATE (b:Bike { Brand:"nHill", Model:"Summit", Price:" | |
| GRAPH.QUERY bikes_graph 'CREATE (b:Bike { Brand:"BikeShind", Model:"ThrillCycle", Price:"815", Type: "Commuter Bikes" })' | ||
| ``` | ||
|
|
||
| Let's create some users. | ||
|
|
||
| ```redis Create users | ||
| // Let's create some user nodes | ||
| GRAPH.QUERY bikes_graph 'CREATE (u:User { Name:"Andrea"})' | ||
|
Member
I believe we should change 'User' to 'Person'. That's the real entity type.
||
|
|
@@ -36,12 +50,15 @@ GRAPH.QUERY bikes_graph 'CREATE (u:User { Name:"Noah"})' | |
| GRAPH.QUERY bikes_graph 'CREATE (u:User { Name:"Mario"})' | ||
| ``` | ||
|
|
||
| ## Adding relationships | ||
| We model graph data very similarly to how we would describe it in a human language: | ||
| - A user makes a transaction | ||
| - That transaction contains a bike | ||
| We already have User and Bike nodes, we're only missing the Transactions, so let's create them. | ||
| We also need to establish the relationships between all the nodes; we do that by matching the existing nodes, saving them in a variable (b, u, t) and using that variable to create the relationships | ||
| ## Add relationships | ||
|
|
||
| Model graph data: | ||
|
|
||
| - A user makes a transaction. | ||
|
Member
Again, 'Person' instead of 'User'.
||
| - That transaction contains a bike. | ||
|
|
||
| You already have `User` and `Bike` nodes. You're only missing transactions. Let's create them. | ||
| You also need to establish the relationships between all the nodes. Match the existing nodes, save them in a variable (b, u, t), and use that variable to create the relationships. | ||
|
|
||
| ```redis Model bike sales | ||
| GRAPH.QUERY bikes_graph ' | ||
|
|
@@ -52,13 +69,15 @@ GRAPH.QUERY bikes_graph ' | |
| ``` | ||
|
|
||
| Let's load some more relationships: | ||
|
|
||
| ```redis Load more bike sales | ||
| GRAPH.QUERY bikes_graph 'MATCH (b:Bike { Model: "Hillcraft"}), (u:User {Name: "Alicia"}) CREATE (t:Transaction {Value: 1200 }) CREATE (u)-[r1:MADE]->(t) CREATE (t)-[r2:CONTAINS]->(b)' | ||
| GRAPH.QUERY bikes_graph 'MATCH (b:Bike { Model: "ThrillCycle"}), (u:User {Name: "Andrea"}) CREATE (t:Transaction {Value: 815 }) CREATE (u)-[r1:MADE]->(t) CREATE (t)-[r2:CONTAINS]->(b)' | ||
| GRAPH.QUERY bikes_graph 'MATCH (b:Bike { Model: "XBN 2.1 Alloy"}), (u:User {Name: "Mathew"}) CREATE (t:Transaction {Value: 810 }) CREATE (u)-[r1:MADE]->(t) CREATE (t)-[r2:CONTAINS]->(b)' | ||
| ``` | ||
|
|
||
| Let's create a REVIEWED relationship between some users and bikes. The relationship will have a "Stars" property that will show the number of stars that the user assigned to the bike and a "ReviewID" property which will point us to the document that contains the review | ||
| Let's create a `REVIEWED` relationship between some users and bikes. The relationship has a `Stars` property that shows the number of stars that the user assigned to the bike and a `ReviewID` property that points you to the document that contains the review. | ||
|
|
||
| ```redis Model users reviewing bikes | ||
| GRAPH.QUERY bikes_graph ' | ||
| MATCH (u:User {Name: "Noah"}), | ||
|
|
@@ -70,7 +89,8 @@ GRAPH.QUERY bikes_graph 'MATCH (u:User {Name: "Mathew"}), (b:Bike { Model: "XBN | |
| GRAPH.QUERY bikes_graph 'MATCH (u:User {Name: "Mario"}), (b:Bike { Model: "Hillcraft"}) CREATE (u)-[r:REVIEWED {ReviewID: 123, Stars: 3}]->(b)' | ||
| ``` | ||
|
|
||
| Users of our bike shop will be able to follow each other so they can get updates on their recent updates | ||
| Users of the bike shop are able to follow each other so they can get updates on their recent activity. | ||
|
|
||
| ```redis Users can follow each other | ||
| GRAPH.QUERY bikes_graph 'MATCH (u1:User {Name: "Andrea"}), (u2:User {Name: "Noah"}) CREATE (u1)-[r:FOLLOWS]->(u2)' | ||
| GRAPH.QUERY bikes_graph 'MATCH (u1:User {Name: "Andrea"}), (u2:User {Name: "Alicia"}) CREATE (u1)-[r:FOLLOWS]->(u2)' | ||
|
|
@@ -79,9 +99,10 @@ GRAPH.QUERY bikes_graph 'MATCH (u1:User {Name: "Mathew"}), (u2:User {Name: "Mari | |
| GRAPH.QUERY bikes_graph 'MATCH (u1:User {Name: "Mario"}), (u2:User {Name: "Andrea"}) CREATE (u1)-[r:FOLLOWS]->(u2)' | ||
| ``` | ||
|
|
||
| ## Utilising the graph for discovering how data is related | ||
| When a user is viewing a page of a bike, we can increase the probability of a sale by showing the relationships that exist between the bike and the user, for example, someone our user follows might have bought the bike already, or might have reviewed it. | ||
| This is very easy to query with a graph database but very tricky with a relational database. | ||
| ## Use the graph to discover how data is related | ||
|
|
||
| When a user accesses a bike page, you can increase the probability of a sale by showing the relationships that exist between the bike and the user. For example, someone your user follows might have already bought the bike or might have reviewed it. | ||
| This is very tricky to query with a relational database, but you can easily query it using a graph database. | ||
|
|
||
| ```redis Check user's connection with a bike | ||
| GRAPH.QUERY bikes_graph 'MATCH p=(u:User {Name: "Andrea"})-[r*1..5]->(b:Bike {Model: "Hillcraft"}) return p' | ||
|
|
@@ -93,4 +114,21 @@ GRAPH.QUERY bikes_graph 'MATCH p=(u1:User {Name: "Andrea"})-[f:FOLLOWS]->(u2:Use | |
|
|
||
| ```redis All users who I follow who reviewed this bike with more than 3 stars | ||
| GRAPH.QUERY bikes_graph 'MATCH p=(u1:User {Name: "Andrea"})-[f:FOLLOWS]->(u2:User)-[r:REVIEWED]->(b:Bike {Model: "Hillcraft"}) WHERE r.Stars>3 return p' | ||
| ``` | ||
| ``` | ||
|
|
||
| ## Use Bloom to check if username is free | ||
|
|
||
| Wondering how a Bloom filter can be used for your bike shop? For starters, you could keep a Bloom filter that stores the usernames of everyone who has already registered with your service. That way, when someone creates a new account, you can very quickly check if that username is free. If the filter answers that the username might exist, you still have to check the main database for the precise result. But if it answers that the username does not exist, you can skip that call and continue with the registration. | ||
|
Member
We shouldn't recommend using a BF to check if something exists in Redis (in this case, if there is some node in RedisGraph). RedisGraph has the full information, so why use probabilistic data? Also, RedisGraph is stored in RAM, so it should be fast enough for such queries.
||
|
|
||
| Another, perhaps more interesting example, is showing better and more relevant ads to users. You can keep a Bloom filter per user with all the products they bought from the shop, and when you get a list of products from your suggestion engine, you can check it against this filter. | ||
|
|
||
| ```redis Add all bought product ids in the Bloom filter | ||
| BF.MADD user:778:bought_products 4545667 9026875 3178945 4848754 1242449 | ||
| ``` | ||
|
|
||
| Just before you try to show an ad to a user, you can first check if that product id is already in their `bought products` Bloom filter. If the answer is yes, you can choose to check the main database, or you might skip to the next recommendation from your list. But if the answer is no, then you know for sure that your user did not buy that product: | ||
|
|
||
| ```redis Has a user bought this product? | ||
| BF.EXISTS user:778:bought_products 1234567 // No, the user has not bought this product | ||
| BF.EXISTS user:778:bought_products 3178945 // The user might have bought this product | ||
| ``` | ||
Finding if a key exists in Redis is fast. We don't recommend creating a Bloom filter for this (you also have the accurate information, so why add probabilistic data on top of it?).
Finding if a key exists in some external (disk-based) database, to avoid cache penetration for the external database, using Redis Bloom filter - does make sense.