Adds new tutorials, updates the existing #11

Closed
nermiller wants to merge 9 commits into redis:main from nermiller:main

Conversation

@nermiller

No description provided.

@nermiller
Author

@elena-kolevska, @ViktarStarastsenka, please review these tutorial updates.

@nermiller changed the title from "Adds new tutorials, updates the existing ones" to "Adds new tutorials, updates the existing" on Jan 17, 2023
@elena-kolevska

elena-kolevska commented Jan 18, 2023

This looks great to me. I just think that the t-digest page seems like a bit of an outlier: it doesn't follow the bikes use case, and it has a lot of detail about commands compared to the others. We have 4 more probabilistic data structures in RedisBloom that we're not mentioning, so as a user, I might wonder why only Bloom and t-digest are mentioned.

Also, it would be nice if we can create examples for the path-finding algorithms that use the same dataset as the rest of the tutorial, but I know that this can be tricky sometimes.


If you receive unhelpful queries from the clients to the database because no keys are matched in Redis, you can use a Bloom filter to filter such queries.

These guidelines help you learn how to use a Bloom filter to reduce heavy calls to the relational database or memory.
Member

I'm not sure if 'memory' should be used here. I assume the call wouldn't be heavy.


A Bloom filter is a probabilistic data structure that enables you to check if an element is present in a set using a very small memory space of a fixed size. It can guarantee the absence of an element from a set by giving a Boolean response about its presence. So, when it responds that an element is not present in a set (`false`), you can be sure that this is indeed the case. However, false positive matches are possible. You can control the tradeoff between accuracy and memory consumption via the `error_rate` argument of the `BF.RESERVE` command.

If you receive unhelpful queries from the clients to the database because no keys are matched in Redis, you can use a Bloom filter to filter such queries.
Member

Finding out whether a key exists in Redis is fast. We don't recommend creating a Bloom filter for this. (You also already have the accurate information, so why add a probabilistic structure on top of it?)
Finding out whether a key exists in some external (disk-based) database, using a RedisBloom filter to avoid cache penetration for that external database, does make sense.
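
For context on the `error_rate` tradeoff mentioned in the hunk above, here is a minimal sketch of the textbook Bloom filter sizing formula. It is an illustration only: the function name `bloom_size` is hypothetical, and RedisBloom's actual memory layout and rounding may differ from these numbers.

```python
import math
from typing import Tuple


def bloom_size(capacity: int, error_rate: float) -> Tuple[int, int]:
    """Return (bits, hash_count) for a classic Bloom filter.

    Textbook formulas, not RedisBloom's implementation:
      bits   = -n * ln(p) / (ln 2)^2
      hashes = (bits / n) * ln 2
    """
    if capacity <= 0 or not 0.0 < error_rate < 1.0:
        raise ValueError("capacity must be > 0 and 0 < error_rate < 1")
    bits = math.ceil(-capacity * math.log(error_rate) / (math.log(2) ** 2))
    hashes = max(1, round(bits / capacity * math.log(2)))
    return bits, hashes


# A lower error rate costs more memory for the same capacity.
for rate in (0.1, 0.01, 0.001):
    bits, hashes = bloom_size(1_000_000, rate)
    print(f"error_rate={rate}: {bits / 8 / 1024:.0f} KiB, {hashes} hash functions")
```

The point of the sketch is the direction of the tradeoff: each tenfold reduction in the error rate adds a roughly constant number of bits per element.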


These guidelines help you learn how to use a Bloom filter to reduce heavy calls to the relational database or memory.

## Avoiding cache penetration
Member

Again, the assumption is that there is some heavy database (not Redis) in the system, and we want to use RedisBloom to avoid cache penetration for that external database (not for Redis itself).

It is important to explain this deployment here and the problem that we are trying to solve in order to make this use case understandable.

In this specific example, each query is a client's IP address, and information about this IP address is stored in some external (presumably slow) database. Many queries carry IP addresses that do not exist in the external database. We want to detect those non-existing IP addresses with RedisBloom and so reduce the number of accesses to the external DB.


Because false-positive matches are possible with a Bloom filter (BF), you can use these options to better handle incoming requests.

### Store all valid keys in a BF upfront
Member

Just to clarify: 'keys' here are IP addresses, that is, the keys of the external database.

You can use graphs for highly interconnected data, like relationships between people, organizations, groups, documents, or places they have access to, and so on.

## Creating nodes
Suppose you want to track which bikes users bought so you can suggest them based on the fact that their friends have also bought them.
Member

Change users to owners?


```redis Create users
// Let's create some user nodes
GRAPH.QUERY bikes_graph 'CREATE (u:User { Name:"Andrea"})'
```
Member

I believe we should change 'User' to 'Person'. That's the real entity type.


Model graph data:

- A user makes a transaction.
Member

Again, 'Person' instead of 'User'


## Use Bloom to check if username is free

Wonder how a Bloom filter can be used for your bike shop? For starters, you could keep a Bloom filter that stores all usernames of people who've already registered with your service. That way, when someone creates a new account, you can very quickly check if that username is free. If the filter answers yes (the username may already be taken), you still have to go and check the main database for the precise result. But if the answer is no, the username is definitely free, so you can skip that call and continue with the registration.
Member

We shouldn't recommend using a BF to check if something exists in Redis (in this case, if there is some node in RedisGraph). RedisGraph has the full information, so why use probabilistic data? Also, RedisGraph is stored in RAM, so it should be fast enough for such queries.

4 participants