PHP Developers Network

A community of PHP developers offering assistance, advice, discussion, and friendship.
 
Loading
It is currently Thu Nov 23, 2017 1:36 pm

All times are UTC - 5 hours




Post new topic Reply to topic  [ 29 posts ]  Go to page Previous  1, 2
Author Message
PostPosted: Wed Aug 23, 2017 11:47 am 
Offline
Spammer :|
User avatar

Joined: Wed Oct 15, 2008 2:35 am
Posts: 6587
Location: WA, USA
hybris wrote:
Is this a better way than just INSERT ... ON DUPLICATE KEY UPDATE ?

It's what I do in my own code, so I think so.

hybris wrote:
If I'm going to use this function with an API is it not better to have 2 separate functions one for save() that returns an error like PRODUCT ALREADY EXIST if you try to add the same product and then another separate function for update that loads the info alredy in DB so you can change it.

Sounds like you're describing two different things: one is not allowing a new product if a duplicate exists, the other is updating an existing product. Using ON DUPLICATE KEY makes it harder to do both separately.

hybris wrote:
If not I fear this update function would be as 'bad' as a DELETE function since you just type in the product ID and it will update all fields with NULL?

Now I'm lost. Why are you setting fields to NULL? Don't do that.

hybris wrote:
I guess I could do some check if it was just one table but since the ingredient table is ID / NAME it will be impossible to know if the manufacturer removed an ingredient or if the user just forgot to type it in?

You can't do much to protect yourself against stupid users. If they forget to type in an ingredient, that's their fault - and there's no way for you to know. All you can do is hope that the user submitted the right information. Offering a sort of preview page before saving is an option, but surely they could also simply see the mistake in the new information and do a second update to fix it?


Top
 Profile  
 
PostPosted: Wed Aug 23, 2017 3:42 pm 
Offline
Forum Contributor

Joined: Wed Sep 25, 2013 4:09 am
Posts: 172
Quote:
Sounds like you're describing two different things: one is not allowing a new product if a duplicate exists, the other is updating an existing product. Using ON DUPLICATE KEY makes it harder to do both separately.


Yes I was describing 2 different things...

First thing I wondered was if

Syntax: [ Download ] [ Hide ]
public function save() {
                if ($this->id) {
                        /* update... */
                } else {
                        /* insert... */
                }
        }

is better than to use ON DUPLICATE KEY... To me it seems like the same functionality except in your function you can add more functionality (like do a usercheck() in the update function and only allow it for admins...or something). You can't add much stuff like that when using ON DUPLICATE KEY so I understand that makes it harder to do both separately..

The second thing I wondered was if I use that save() function with my api would it not be better to have 2 separate functions like create() and update() where the create function return "Error: Product already exist" if trying to add a product that already exist and then have a separate update function for updating a product.

To me that would make more sense.. say I have a product 1 where a user added tons of info like manufacturer info, nutrition and ingredients and then I have a second (lazy) user that tries to create product 1 but don't bother to fill out nutrition and manufacturer info. If it just updates the new update will erase lot of the old info.

I used your suggestion for ingredient and nutrition tables (id | ingredient)..
When I update those tables i first delete all info for the specific id and then add the new info since it can happen a producer drops an ingredient so i cant just find and change one of the existing ingredients (like Felix ketchup updates to heintz ketchup).

To have a separate function for update the user will have a prefilled form where he can change a value or delete a value. If the lazy user just want to change an ingredient he hopefully won't bother with erasing the nutrition info just to be mean..

Quote:
Now I'm lost. Why are you setting fields to NULL? Don't do that.


Why not?
I have lots of fields and I'm not sure all users bother to fill out everything all the time.. perhaps one user adds like nutrition and ingredients and then someone else bother to fill out manufacturer info at a later time. Also a field like "updated by" or one of the netweight or netvolume fields will be empty..
I guess instead of null i can have some standard char inserted like "-" if the field is not filled out but still if the lazy user don't add say manufacturer info when he tries to create a new product when this product already exist the old values will be overwritten with "-" instead of null...
Or am I missing something?

Quote:
You can't do much to protect yourself against stupid users. If they forget to type in an ingredient, that's their fault - and there's no way for you to know. All you can do is hope that the user submitted the right information. Offering a sort of preview page before saving is an option, but surely they could also simply see the mistake in the new information and do a second update to fix it?


Yeah I get that. If they miss an ingredient when adding a new product there is no way to know they missed an ingredient.

I was only discussing the save() function that contain both the create and update functionality in the same function.
If I have a separate update function I still can't protect me from evil users that intentionally deletes ingredients or enter wrong information. But a separate update function will atleast give the user prefilled fields so they have to actively erase information.

in the save() function if someone tries to add an existing product it will simply update if the product already exists but since there are no prefilled fields a lazy user may not fill out the nutrition info thus it will be erased.

It might be that I do the update function wrong but the table like
Syntax: [ Download ] [ Hide ]
product info
id * name * netweight * net vol * manufacturer * manufacturer email... and so on
 

I use Update - Set
but the table
Syntax: [ Download ] [ Hide ]
ingredients
id * ingredient
1    tomato
1    salt
1    horsemeat
2    egg
2    beef
.
. and so on
 

I just do DELETE id and then INSERT id


I guess in my API on POST to create a new product I can do a Product::get() check to see if the product exists and if so throw an exception from there instead of calling $this->save()

Sorry for all text but english is not my first language and huge thanks for all your help!


Top
 Profile  
 
PostPosted: Thu Aug 24, 2017 1:38 am 
Offline
Spammer :|
User avatar

Joined: Wed Oct 15, 2008 2:35 am
Posts: 6587
Location: WA, USA
hybris wrote:
First thing I wondered was if

Syntax: [ Download ] [ Hide ]
public function save() {
                if ($this->id) {
                        /* update... */
                } else {
                        /* insert... */
                }
        }

is better than to use ON DUPLICATE KEY... To me it seems like the same functionality except in your function you can add more functionality (like do a usercheck() in the update function and only allow it for admins...or something). You can't add much stuff like that when using ON DUPLICATE KEY so I understand that makes it harder to do both separately..

Right. Which is why it's better. INSERT...ON DUPLICATE KEY is only really suited for times when all the data being updated also establishes uniqueness. Like relationships between tables, where all the data (the two keys and maybe another field or two) form a unique key (changing either of them changes the meaning of the relationship); the relationship either exists or does not exist with no other states or forms. However if you started adding more data to the relationship, that supplements it without contributing to the exact nature of the relationship itself, then INSERT...ON DUPLICATE KEY becomes a problem again; the relationship does not simply exist or not exist anymore, but it can exist in multiple states/forms (according to the additional data).

hybris wrote:
The second thing I wondered was if I use that save() function with my api would it not be better to have 2 separate functions like create() and update() where the create function return "Error: Product already exist" if trying to add a product that already exist and then have a separate update function for updating a product.

It would be better.

An API allows for consumers (ie, the clients using the API) to perform actions on entities. You have two options for how the API works:

1. "Create" and "update" products
2. Somehow get product data and then "save" it

I'd rather not go into details about how the two work, but you can figure it out yourself too if you think about it. Suffice it to say, the first option is easier for you (better error options with less work involved), a bit more work for the consumer (they have to know which action to use), and better for the data (clear separation between what's new and what's existing).

Your API code should then reflect how the API works as a whole. Note I said "API code": your implementation may very well look like
Syntax: [ Download ] [ Hide ]
if (/* create */) {
        $product = Product::create($foo, $bar);
        $product->baz = $request["baz"];
        $product->save();
        return $product->id;
} else {
        $product = Product::get($request["id"]);
        $product->baz = $request["baz"];
        $product->save();
}
 

where you end up calling a "save" method, but that's on the product model - not the API. Different things.

hybris wrote:
To me that would make more sense.. say I have a product 1 where a user added tons of info like manufacturer info, nutrition and ingredients and then I have a second (lazy) user that tries to create product 1 but don't bother to fill out nutrition and manufacturer info. If it just updates the new update will erase lot of the old info.

If we say the save problem is resolved in favor of separate insert/update then this becomes a non-issue: the second user will not be able to create a product because it already exists.

hybris wrote:
I used your suggestion for ingredient and nutrition tables (id | ingredient)..
When I update those tables i first delete all info for the specific id and then add the new info since it can happen a producer drops an ingredient so i cant just find and change one of the existing ingredients (like Felix ketchup updates to heintz ketchup).

Adding and removing individual ingredients is possible, but it is easier to just work with the list as a whole. What you do in the database is up to you - DELETE all and then INSERT all, or SELECT and identify changes and DELETE/INSERT as needed (with the former being easier in code).

hybris wrote:
Quote:
Now I'm lost. Why are you setting fields to NULL? Don't do that.

Why not?
I have lots of fields and I'm not sure all users bother to fill out everything all the time.. perhaps one user adds like nutrition and ingredients and then someone else bother to fill out manufacturer info at a later time. Also a field like "updated by" or one of the netweight or netvolume fields will be empty..
I guess instead of null i can have some standard char inserted like "-" if the field is not filled out but still if the lazy user don't add say manufacturer info when he tries to create a new product when this product already exist the old values will be overwritten with "-" instead of null...
Or am I missing something?

No, I'm the one missing something. NULL for a lack of data is fine - definitely don't use a "-" or whatever.

hybris wrote:
I guess in my API on POST to create a new product I can do a Product::get() check to see if the product exists and if so throw an exception from there instead of calling $this->save()

When you break an action (like creating a new product) into more than one database operation (finding an existing product then saving the new one) you have to start worrying about concurrency. What if two users try to create the same product at the same time?

- Consumer A calls the API
- API A sees no product
- Consumer B calls the API
- API B sees no product
- API B creates the product
- API A creates the product

It may be unlikely but you should prepare for that.

There's two basic ways to deal with it:

1. An exclusive lock: lock the product table against all read/write operations performed by other connections, get existing data, then create the product and unlock the table. Anyone else attempting to access the product table will stall until the lock is released (which should just be milliseconds).
2. Don't break the action into parts. Kinda. Make the code attempt an INSERT IGNORE with all the product data, then check what happened: if successful then the product was indeed created and you can continue saving other data (ingredients, etc.), and if not then something happened and you can't create the new product. You can (and should) still try to get an existing product first, and that will cover 99% of cases, but you need the combined get/update which the INSERT does implicitly to know for sure.

#2 only works if there's a unique constraint on the product table that would interfere with creating a new product. The whatsit IDs you mentioned earlier sound appropriate, but a key on a name or something like that would not be.


Top
 Profile  
 
PostPosted: Thu Aug 24, 2017 3:16 am 
Offline
Forum Contributor

Joined: Wed Sep 25, 2013 4:09 am
Posts: 172
Quote:
What if two users try to create the same product at the same time?


Well I hired a shaman to perform a ritual involving lots of smoke and dance to make sure 2 users never tries to create the same product at the same time haha :)

I guess your way is more failsafe :)

But I actually used your structure in the save() function (update else insert) and make the "if already exists" check from the API so I dont know what will happen in your example

Quote:
- Consumer A calls the API
- API A sees no product
- Consumer B calls the API
- API B sees no product
- API B creates the product
- API A creates the product


Wouldn't it just be API A overwrites the API B then? Since NO API sees the product from the start then API B create the product and then API A creates the product?

If it occur at the exact same millisecond of a millisecond there might be some wierd error but wouldnt this be the case with 2 API trying to lock the same table at the exacly same milli second of a millisecond?

If I use the INSERT IGNORE command how do I catch the change if it returns no error if duplicate?
Shall I use mysqli_affected_rows() or is there a more "standard way" to check? (So I can determine if I shall go on and update the other tables ingredients and so on).

Again many thanks for helping me out! I learned tons from these boards and my code is way cleaner and better structured now than it was only a few weeks ago :)


Top
 Profile  
 
PostPosted: Thu Aug 24, 2017 5:35 am 
Offline
Spammer :|
User avatar

Joined: Wed Oct 15, 2008 2:35 am
Posts: 6587
Location: WA, USA
hybris wrote:
Well I hired a shaman to perform a ritual involving lots of smoke and dance to make sure 2 users never tries to create the same product at the same time haha :)

Shaman or not, don't rely on it. Because it will happen.

hybris wrote:
But I actually used your structure in the save() function (update else insert) and make the "if already exists" check from the API so I dont know what will happen in your example

hybris wrote:
Wouldn't it just be API A overwrites the API B then? Since NO API sees the product from the start then API B create the product and then API A creates the product?

Look at your code. If both APIs think the product doesn't exist then both will create new products. What will happen when both are saved?

hybris wrote:
If it occur at the exact same millisecond of a millisecond there might be some wierd error but wouldnt this be the case with 2 API trying to lock the same table at the exacly same milli second of a millisecond?

They'll "lock" the table, yeah, but the lock isn't the issue here.

hybris wrote:
If I use the INSERT IGNORE command how do I catch the change if it returns no error if duplicate?

You can tell by how many rows were inserted.


Top
 Profile  
 
PostPosted: Sat Aug 26, 2017 7:06 am 
Offline
Forum Contributor

Joined: Wed Sep 25, 2013 4:09 am
Posts: 172
Thanks :)

I think I got it to work now... (I havent had the chance of creating 2 products at the same time yet).

requinix wrote:
But searching ingredients may not be as easy as you think. If you want to search for "egg" then what about "eggs"? What about "egg yolk"? And on the other hand, should a search for "chicken" also match "chicken broth"?


Haha on so now I'm back to your first post in this thread heh..

I managed to make a function that returns matches in searches but only exact matches. Is it possible to use REGEX in queries?

I want to find for example *milk* OR *Milk* OR *MILK* in the ingredient list (actually before I write ingredients to the table I do a str_to_lower so Milk or MILK is very unlikely heh) but I have ingredients like milkprotein or milkpowder or powderedmilk or milk powder or powdered_milk....

or can i use FIND_IN_SET or WHERE (x like '%milk%' )?
Will it slow down the DB much?


Top
 Profile  
 
PostPosted: Sat Aug 26, 2017 7:40 am 
Offline
Spammer :|
User avatar

Joined: Wed Oct 15, 2008 2:35 am
Posts: 6587
Location: WA, USA
Yes, you can use regular expressions (in MySQL) but that doesn't solve the problem.

Say you've got "milk" and "Milk" (should match), "smilk" and "milke" (should not match), and "powdered milk" (should match... well, maybe, but let's say it should).

- Searching ="milk" will find the two good ones, not find the two bad ones, but not find the last
- Searching LIKE "%milk%" will find all the good ones but also all the bad ones
- Searching LIKE "milk%" or "%milk" will find one of the two bad ones

The only other one left to try is a regex:
- Searching REGEX "\bmilk\b" (\b is a word boundary) will find the two good ones, not find the two bad ones, and will find the last

Regex is slow but it works, right? No, because we simply haven't found a conflicting example.
- Searching REGEX "\begg\b" will find "egg" (good) but not find "eggs" (bad)

I'm thinking you
a) Tell the user that all ingredients must be normalized, meaning "eggs" must be "egg" instead. No plurals or other similar forms of words. You can't really enforce that automatically.
b) Force normalization but do it in code. This is a bit harder as you'll need a library that understands stemming in English - you can't simply remove a trailing "s" and the like.
c) Use another engine for searches - one that also understands stemming. That means doing searches through it, not MySQL.

(a) is a burden on the user and not reliable, and (c) is at least a burden on you because there's a lot more work that goes into it. I think (b) is your best option. If you want to output "6 eggs" given count=6 and ingredient=egg then you've got a bit of a problem, however the same library may be able to make words plural too.


Top
 Profile  
 
PostPosted: Sat Aug 26, 2017 8:22 am 
Offline
Moderator
User avatar

Joined: Tue Nov 09, 2010 3:39 pm
Posts: 6391
Location: Montreal, Canada
For c.), it might be worth taking a look at something like Elastic Search. It's not particularly difficult to set up and should help with what you're trying to accomplish.

_________________
Supported PHP versions No longer supported versions


Top
 Profile  
 
PostPosted: Sat Aug 26, 2017 3:06 pm 
Offline
Forum Contributor

Joined: Wed Sep 25, 2013 4:09 am
Posts: 172
Ouch,
this seems even harder than I thought from the start.

What Im searching is ingrediens from food labels so that gives me an option to refuse plurals.

I was thinking of having some friends help me add lots of product info. When I have some hundred+ different products I sort out doubles and save it as an ingredient table where i also add different names that means the same thing like citric acid - E330.
Then when adding new products i do AJAX call to that table so if someone try to add eggs they will get egg as suggestion. If they still try to type eggs instead they must add eggs as an new ingredient to the ingredient table and i can flagg the ingredient and product for manual inspektion. If its a legal new ingredient its cleared, if it is egg-eggs eggs can be linked with egg in the table...

Will be lots of controlling in the beginning bit after some 1000 products added it will be less and less work.
Maybe some thing like that is needed anyway since some ingredients can have multiple sales names which is hard to find through regular searchdictionaries..?


Top
 Profile  
 
PostPosted: Sun Aug 27, 2017 1:49 am 
Offline
Spammer :|
User avatar

Joined: Wed Oct 15, 2008 2:35 am
Posts: 6587
Location: WA, USA
More food for thought (heh): I have something here made with "white grape juice from concentrate".

If what you described for inputting ingredients works then that's the safest option: a human who knows the rules manually approves each new ingredient. Problem is it takes manpower to do that. You need staff to work on it.
You don't necessarily have to be so strict, though. The user could be forced to pick "egg" and then have an option to say "but the ingredients list actually says 'eggs'" - store the latter, link to the former. A human could still review it, but the optional extra step should help keep the data correct. Later when someone enters "egg" they only see the singular version, but if they also override with the plural then you can reuse that ingredient.

Personally I would still look into stemming. It won't fix all the potential problems, but it could at least assist with the most common issues; user enters "eggs" (new ingredient), stemmer recognizes it as a plural of "egg" (existing ingredient), system adds and links automatically.


Top
 Profile  
 
PostPosted: Tue Aug 29, 2017 5:58 pm 
Offline
Forum Contributor

Joined: Wed Sep 25, 2013 4:09 am
Posts: 172
requinix wrote:
The user could be forced to pick "egg" and then have an option to say "but the ingredients list actually says 'eggs'" - store the latter, link to the former.


Yeah thats a good idea. Thanks :)

I may check out stemming later but Im still a newb so I think I'll try to make it simple first and then upgrade when most pieces are in place.

I have another problem with SELECT like..

Syntax: [ Download ] [ Hide ]
$q="{$q}%";
   /* if ($stmt = $mysqli->prepare("SELECT
                EAN
                FROM Swe_Products WHERE EAN like '{$q}%' ")) { */

                if ($stmt = $mysqli->prepare("SELECT
                EAN
                FROM Swe_Products WHERE EAN like ? "
)) {
                $stmt->bind_param('i', $q);
                $stmt->execute();
                $stmt->bind_result($EAN[]);
                $stmt->store_result();
                $stmt->fetch();
                }


The greyed out part works as intended but I cant bind_param $q when I do it like that.
So I tried to do $q="{$q}%"; and SELECT like ? instead.. then the bind works but not the rest heh

(I have an EAN that starts with 7 so if I make $q=7 when i use the greyed out code $EAN[] = 76... which is ok. When I use the current code $EAN == "".

(Its part of an AJAX function)

If I try to bind with the greyed out code i get:
Suggestions: Warning: mysqli_stmt::bind_param(): Number of variables doesn't match number of parameters in prepared statement in /link/getEAN.php on line 78 763143604

??

EDIT: Before you tell me yes I know EAN[] will only contain 1 value.. I just copypasted the code to try it out and EAN must be array for the rest of the function.. :)


Top
 Profile  
 
PostPosted: Tue Aug 29, 2017 6:09 pm 
Offline
Forum Contributor

Joined: Wed Sep 25, 2013 4:09 am
Posts: 172
Syntax: [ Download ] [ Hide ]
if ($stmt = $mysqli->prepare("SELECT
                EAN
                FROM Swe_Products WHERE EAN like CONCAT(?,'%') "
)) {
                $stmt->bind_param('i', $q);
                $stmt->execute();
                $stmt->bind_result($EAN[]);
                $stmt->store_result();
                $stmt->fetch();
                }


Well I actually solved it :)


Top
 Profile  
 
PostPosted: Fri Sep 22, 2017 3:43 am 
Offline
Forum Contributor

Joined: Wed Sep 25, 2013 4:09 am
Posts: 172
Is it possible to populate my array from different tables inside the fetchloop?
The greyed out code doesnt work.

Syntax: [ Download ] [ Hide ]
if ($stmt = $mysqli->prepare("SELECT
    EAN
    ..
    .
   
    FROM table1 WHERE EAN like CONCAT(?,'%') LIMIT 5"
)) {
    $stmt->bind_param('i', $q);
    $stmt->execute();
    $stmt->bind_result($ean,  .. .);
    $stmt->store_result();
    $i=0;
    while($row=$stmt->fetch()){
        $result[$i][ean]=$ean;
        ..
        .
        /*
        if ($stmt = $mysqli->prepare("SELECT
                Table2.name,
                Table2.unit,
                Table3.amount
                FROM Table2, Table3
                WHERE Table2.EAN = ? AND Table2.list = Table3.id")){
                $stmt->bind_param('s', $data["EAN"]);
                $stmt->execute();
                $stmt->bind_result($name, $unit, $value);
                $stmt->store_result();
                while($row = $stmt->fetch()){
                    $result[$i]["nutrition_info"][$name][$value] = $unit;
                }
            }
            */

        $i++;
    }
}
 


Top
 Profile  
 
PostPosted: Fri Sep 22, 2017 4:24 am 
Offline
Spammer :|
User avatar

Joined: Wed Oct 15, 2008 2:35 am
Posts: 6587
Location: WA, USA
You can't reuse $stmt and $row in there - different variables should work. But it would probably be better to combine the two queries so that you only need one loop.


Top
 Profile  
 
Display posts from previous:  Sort by  
Post new topic Reply to topic  [ 29 posts ]  Go to page Previous  1, 2

All times are UTC - 5 hours


Who is online

Users browsing this forum: No registered users and 10 guests


You cannot post new topics in this forum
You cannot reply to topics in this forum
You cannot edit your posts in this forum
You cannot delete your posts in this forum
You cannot post attachments in this forum

Jump to:  
Powered by phpBB® Forum Software © phpBB Group